Jekyll tutorial

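In short, Jekyll is a program that converts text written in a markup language (for example, Markdown files) into a static website. This blog is built with Jekyll and hosted on GitHub Pages. What follows are my notes from first working with it.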

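Installation

First, a few concepts need to be sorted out. Ruby is a scripting language, and Jekyll is a package written in it. Running Jekyll therefore requires a Ruby interpreter with Jekyll's dependencies installed.

Ruby packages and package managers

A Ruby package is usually distributed as a .gem file. RubyGems is the package manager that ships with Ruby; it builds source code into .gem files and installs them. The gem command in the shell is the RubyGems tool.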

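Rails and Bundler

Rails is a well-known web development framework written in Ruby. Jekyll is not built on Rails; it is a standalone gem, but like most Rails projects it is normally managed with Bundler (the bundle command). Bundler is a dependency manager that can be thought of as a layer on top of RubyGems. With Bundler, all of a project's dependencies are declared in a text file (named Gemfile by default), so they can be installed with a single bundle install.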

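Ruby compared with Python

Ruby | Python | Notes
--- | --- | ---
Ruby | Python | Both are scripting languages
RubyGems, Bundler | pip | Package management tools
Gemfile | requirements.txt | List of dependencies
Rails | Flask | Rails is a framework written in Ruby; Flask is a framework written in Python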

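Installation steps

Method 1 (strongly discouraged on WSL2): follow the installation steps on the official Jekyll website.

Method 2 (recommended): avoid installing a system-wide Ruby

RVM is roughly the Ruby counterpart of Anaconda: it can install and manage several Ruby versions side by side.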

# Install RVM: http://rvm.io/
gpg2 --keyserver keyserver.ubuntu.com --recv-keys 409B6B1796C275462A1703113804BB82D39DC0E3 7D2BAF1CF37B13E2069D6956105BD0E739499BDB
curl -sSL https://get.rvm.io | bash -s stable

# Reference: https://stackoverflow.com/questions/75452016/installation-messed-up-with-ruby-unable-to-install-jekyll
# Install the exact version that is selected as the default on the next line
rvm install 2.7.2
rvm use 2.7.2 --default
rvm -v
rvm gemset update
gem install jekyll -v 4.2.1
jekyll -v
cd /path/to/username.github.io

# Install the dependencies listed in the Gemfile, then serve the site
bundle install
bundle exec jekyll serve

Switching the gem source to a mirror:

gem sources --add https://gems.ruby-china.com/ --remove https://rubygems.org/
gem sources -l

GitHub Pages

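Once Jekyll is installed, the following commands generate a simple "Hello World" static site.

# Create a folder named my-awesome-site and generate the starter files inside it
jekyll new my-awesome-site
cd my-awesome-site
bundle exec jekyll serve
# Open http://localhost:4000 in a browser to view the site

At this point you can already build a site and browse it locally. With your own server and domain name, other people could see it too; without them, GitHub Pages effectively provides a free server and domain. To use it you first need a GitHub account. Suppose the account name is foo, so that your profile page is at https://github.com/foo. Create a repository named foo.github.io, copy all the files from the my-awesome-site folder above into it, then commit and push to GitHub. After a few minutes, opening https://foo.github.io in a browser shows the generated site.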

Directory structure of a Jekyll project

_data/
_includes/
_layouts/
_posts/
assets/
_sass/
_site/
_config.yml
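  • _site: the generated site, that is, the output written by bundle exec jekyll serve (or jekyll build)

  • _data: holds site-wide data. For example, if _data contains a file named navigation.yml, then site.data.navigation refers to the data in that file and can be used in Liquid templates

  • _includes: holds reusable fragments. For example, a file named navigation.html placed here can be pulled into a layout in _layouts, such as default.html, as follows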

    {% include navigation.html %}
    
  • _layouts: a layout file in this folder, such as default.html, can be applied to any other file that begins with the following front matter

    ---
    layout: default
    ---
    

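    For example, suppose that after those three lines the file _posts/2018-08-20-bananas.md contains two paragraphs (separated by a blank line so that Markdown does not merge them)

    aaa

    bbb
    

    Then, after Jekyll converts it, _site/2018/08/20/bananas.html is _layouts/default.html with its {{ content }} placeholder replaced by the HTML rendered from those two paragraphs, namely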

    <p>aaa</p>
    <p>bbb</p>
    
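  • _sass: optional. For example, _sass/main.scss can be imported by assets/css/styles.scss as follows (the two --- lines are an empty front matter, which tells Jekyll to process the file)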

    ---
    ---
    @import "main";
    
  • assets: conventionally organized as

    assets/
      - css/
      - images/
      - js/
    

    In the generated site these files appear under _site/assets

  • _posts: file names must follow the pattern YYYY-MM-DD-title.{ext}, for example _posts/2010-09-03-bananas.md. With the default permalink setting, Jekyll writes the generated HTML to _site/YYYY/MM/DD/title.html

  • index.html in the project root is mapped to the site's / path. The mapping is as follows

    ROOT
      - index.html  # -> _site/index.html -> ip:port/
      - about.html  # -> _site/about.html -> ip:port/about
      - hello.html  # -> _site/hello.html -> ip:port/hello
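
The naming rule for posts can be sketched in a few lines of Python. This is only an illustration of the default date-based mapping for a post without categories; default_output_path and POST_NAME are hypothetical names, not part of Jekyll.

```python
import re
from pathlib import PurePosixPath

# Hypothetical helper, not part of Jekyll: mimics the default date-based
# output path for a post that has no categories.
POST_NAME = re.compile(r"^(\d{4})-(\d{2})-(\d{2})-(.+)\.(\w+)$")


def default_output_path(post_path: str) -> str:
    """Map _posts/YYYY-MM-DD-title.ext to _site/YYYY/MM/DD/title.html."""
    name = PurePosixPath(post_path).name
    match = POST_NAME.match(name)
    if match is None:
        raise ValueError(f"not a valid post file name: {name}")
    year, month, day, title, _ext = match.groups()
    return f"_site/{year}/{month}/{day}/{title}.html"


print(default_output_path("_posts/2010-09-03-bananas.md"))
# prints: _site/2010/09/03/bananas.html
```

A file whose name does not start with a date, such as about.html, is rejected here, which mirrors the fact that only correctly named files in _posts are treated as posts.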
    

Prerequisite: the Liquid template language

Official Liquid documentation

Layout inheritance

_layouts/default.html
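
To give a feel for what an output tag such as {{ site.data.navigation }} does, here is a toy Python stand-in. It only handles dotted-name lookup inside {{ ... }}; real Liquid also has filters, tags and control flow, none of which are modeled. The names resolve and render_outputs, and the sample context, are assumptions made for this sketch.

```python
import re


def resolve(path, context):
    """Follow a dotted name such as 'site.data.navigation' through nested dicts."""
    value = context
    for part in path.split("."):
        value = value[part]
    return value


def render_outputs(template, context):
    """Replace each {{ dotted.name }} with the value found in context."""
    pattern = re.compile(r"\{\{\s*([\w.]+)\s*\}\}")
    return pattern.sub(lambda m: str(resolve(m.group(1), context)), template)


# Hypothetical data: in Jekyll, site.data.navigation would come from _data/navigation.yml.
context = {
    "site": {"data": {"navigation": {"home": "/"}}},
    "page": {"title": "Bananas"},
}
print(render_outputs("<title>{{ page.title }}</title>", context))
# prints: <title>Bananas</title>
```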

<!doctype html>
<html>
  <head>
    <meta charset="utf-8">
    <title>(LTS) Jekyll tutorial</title>
  </head>
  <body>
    <article class="post h-entry" itemscope itemtype="http://schema.org/BlogPosting">

  <header class="post-header">
    <h1 class="post-title p-name" itemprop="name headline">(LTS) Misc</h1>
    <p class="post-meta">
      <time class="dt-published" datetime="2024-09-21T03:00:00+00:00" itemprop="datePublished">Sep 21, 2024
      </time></p>
  </header>

  <div class="post-content e-content" itemprop="articleBody">
    <div style="position: fixed; padding: 1em; right: 0; top: 0; width: 10%; height: 80%; overflow: auto;">
      <div id="toc"></div>
    </div>
    <h1 id="lc_-环境变量"><code class="language-plaintext highlighter-rouge">LC_*</code> environment variables</h1>

<p>References:</p>

<ul>
  <li><a href="https://sites.ualberta.ca/dept/chemeng/AIX-43/share/man/info/C/a_doc_lib/aixbman/baseadmn/locale_env.htm">https://sites.ualberta.ca/dept/chemeng/AIX-43/share/man/info/C/a_doc_lib/aixbman/baseadmn/locale_env.htm</a></li>
  <li><a href="https://www.ibm.com/docs/en/aix/7.3?topic=locales-understanding-locale-environment-variables">https://www.ibm.com/docs/en/aix/7.3?topic=locales-understanding-locale-environment-variables</a></li>
</ul>

<p><code class="language-plaintext highlighter-rouge">LC_*</code> environment variables configure locale settings. The main ones are:</p>

<p>High priority:</p>

<ul>
  <li><code class="language-plaintext highlighter-rouge">LC_ALL</code>: when set, it overrides the values of all other <code class="language-plaintext highlighter-rouge">LC_*</code> variables</li>
  <li><code class="language-plaintext highlighter-rouge">LC_COLLATE</code>: affects the collation (sorting) rules for characters</li>
  <li><code class="language-plaintext highlighter-rouge">LC_CTYPE</code>: affects character classification (letters, digits, symbols, and so on), as well as the range of the character set and its byte representation</li>
</ul>

<p>Medium priority:</p>

<ul>
  <li><code class="language-plaintext highlighter-rouge">LC_MESSAGES</code>: controls the language of prompts and error messages shown by programs</li>
  <li><code class="language-plaintext highlighter-rouge">LC_MONETARY</code>: controls the currency symbol and where it is placed</li>
  <li><code class="language-plaintext highlighter-rouge">LC_NUMERIC</code>: controls number formatting (for example, a comma every three digits)</li>
  <li><code class="language-plaintext highlighter-rouge">LC_TIME</code>: controls how dates and times are displayed</li>
</ul>

<p>Low priority:</p>

<ul>
  <li><code class="language-plaintext highlighter-rouge">LANG</code>: when <code class="language-plaintext highlighter-rouge">LC_ALL</code> is not set and a given <code class="language-plaintext highlighter-rouge">LC_*</code> variable is not set either, that <code class="language-plaintext highlighter-rouge">LC_*</code> variable takes the value of <code class="language-plaintext highlighter-rouge">LANG</code></li>
</ul>

<p>Common values for these variables are:</p>

<ul>
  <li><code class="language-plaintext highlighter-rouge">C</code>, <code class="language-plaintext highlighter-rouge">POSIX</code>: the two are equivalent. The character set covers only ASCII. This is the simplest, region-independent default; it guarantees consistent behavior in C programs and is also the default when the operating system starts</li>
  <li><code class="language-plaintext highlighter-rouge">C.UTF-8</code>: an extension of <code class="language-plaintext highlighter-rouge">C</code> that mainly widens the character set to UTF-8 while staying region-independent. Recommended when you want the consistent behavior of the C locale together with UTF-8.</li>
  <li><code class="language-plaintext highlighter-rouge">en_US.utf8</code>: UTF-8 character set; dates, times, currency and other formats follow United States conventions</li>
  <li><code class="language-plaintext highlighter-rouge">zh_CN.utf8</code>: UTF-8 character set; dates, times, currency and other formats follow Chinese conventions</li>
</ul>

<p>The priority rules above can be illustrated with the following Python code:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">import</span> <span class="nn">os</span>

<span class="n">keys</span> <span class="o">=</span> <span class="p">[</span><span class="s">"LC_COLLATE"</span><span class="p">,</span> <span class="s">"LC_CTYPE"</span><span class="p">,</span> <span class="s">"LC_MESSAGES"</span><span class="p">,</span> <span class="s">"LC_MONETARY"</span><span class="p">,</span> <span class="s">"LC_NUMERIC"</span><span class="p">,</span> <span class="s">"LC_TIME"</span><span class="p">]</span>
<span class="n">lc_vars</span> <span class="o">=</span> <span class="p">{</span><span class="n">key</span><span class="p">:</span> <span class="s">"C"</span> <span class="k">for</span> <span class="n">key</span> <span class="ow">in</span> <span class="n">keys</span><span class="p">}</span>  <span class="c1"># "C" is the default value
</span><span class="k">for</span> <span class="n">key</span> <span class="ow">in</span> <span class="n">keys</span><span class="p">:</span>
    <span class="k">if</span> <span class="n">os</span><span class="p">.</span><span class="n">environ</span><span class="p">.</span><span class="n">get</span><span class="p">(</span><span class="s">"LC_ALL"</span><span class="p">,</span> <span class="s">""</span><span class="p">):</span>
        <span class="n">lc_vars</span><span class="p">[</span><span class="n">key</span><span class="p">]</span> <span class="o">=</span> <span class="n">os</span><span class="p">.</span><span class="n">environ</span><span class="p">.</span><span class="n">get</span><span class="p">(</span><span class="s">"LC_ALL"</span><span class="p">,</span> <span class="s">""</span><span class="p">)</span>
    <span class="k">elif</span> <span class="n">os</span><span class="p">.</span><span class="n">environ</span><span class="p">.</span><span class="n">get</span><span class="p">(</span><span class="n">key</span><span class="p">,</span> <span class="s">""</span><span class="p">):</span>
        <span class="n">lc_vars</span><span class="p">[</span><span class="n">key</span><span class="p">]</span> <span class="o">=</span> <span class="n">os</span><span class="p">.</span><span class="n">environ</span><span class="p">.</span><span class="n">get</span><span class="p">(</span><span class="n">key</span><span class="p">,</span> <span class="s">""</span><span class="p">)</span>
    <span class="k">elif</span> <span class="n">os</span><span class="p">.</span><span class="n">environ</span><span class="p">.</span><span class="n">get</span><span class="p">(</span><span class="s">"LANG"</span><span class="p">,</span> <span class="s">""</span><span class="p">):</span>
        <span class="n">lc_vars</span><span class="p">[</span><span class="n">key</span><span class="p">]</span> <span class="o">=</span> <span class="n">os</span><span class="p">.</span><span class="n">environ</span><span class="p">.</span><span class="n">get</span><span class="p">(</span><span class="s">"LANG"</span><span class="p">,</span> <span class="s">""</span><span class="p">)</span>
</code></pre></div></div>

<p>Besides the variables above, the <code class="language-plaintext highlighter-rouge">locale</code> command also lists these:</p>

<ul>
  <li><code class="language-plaintext highlighter-rouge">LANGUAGE</code>: mainly sets the priority of translations. For example, <code class="language-plaintext highlighter-rouge">LANGUAGE=fr:en:de</code> means the translation priority, from high to low, is French, English, German</li>
  <li><code class="language-plaintext highlighter-rouge">LC_PAPER</code>: paper size</li>
  <li><code class="language-plaintext highlighter-rouge">LC_NAME</code>: format for writing personal names</li>
  <li><code class="language-plaintext highlighter-rouge">LC_ADDRESS</code>: format for writing addresses</li>
  <li><code class="language-plaintext highlighter-rouge">LC_TELEPHONE</code>: telephone number format</li>
  <li><code class="language-plaintext highlighter-rouge">LC_MEASUREMENT</code>: system of measurement units</li>
  <li><code class="language-plaintext highlighter-rouge">LC_IDENTIFICATION</code>: metadata describing the locale itself</li>
</ul>

<p><strong>So the simplest approach is to set <code class="language-plaintext highlighter-rouge">LC_ALL</code> directly. Finally, all of the variables and priorities above only take effect for applications that strictly follow the POSIX standard, and they usually matter only in software localization scenarios (internationalization, i18n, and localization, l10n)</strong></p>

<p>The following is an example using <code class="language-plaintext highlighter-rouge">LC_TIME</code></p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">import</span> <span class="nn">locale</span>
<span class="kn">from</span> <span class="nn">datetime</span> <span class="kn">import</span> <span class="n">datetime</span>

<span class="c1"># Try setting this to: C, C.UTF-8, en_US.UTF-8, zh_CN.UTF-8
</span><span class="n">locale</span><span class="p">.</span><span class="n">setlocale</span><span class="p">(</span><span class="n">locale</span><span class="p">.</span><span class="n">LC_TIME</span><span class="p">,</span> <span class="s">'C'</span><span class="p">)</span>

<span class="c1"># Get and print the locale currently in effect for LC_TIME
</span><span class="n">current_locale</span> <span class="o">=</span> <span class="n">locale</span><span class="p">.</span><span class="n">getlocale</span><span class="p">(</span><span class="n">locale</span><span class="p">.</span><span class="n">LC_TIME</span><span class="p">)</span>
<span class="k">print</span><span class="p">(</span><span class="s">"Current LC_TIME locale:"</span><span class="p">,</span> <span class="n">current_locale</span><span class="p">)</span>

<span class="c1"># Print the current time; note that the format used here is %c
</span><span class="n">now</span> <span class="o">=</span> <span class="n">datetime</span><span class="p">.</span><span class="n">now</span><span class="p">()</span>
<span class="k">print</span><span class="p">(</span><span class="s">"Current time:"</span><span class="p">,</span> <span class="n">now</span><span class="p">.</span><span class="n">strftime</span><span class="p">(</span><span class="s">'%c'</span><span class="p">))</span>
</code></pre></div></div>

<p>Output:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code># When set to C or POSIX
Current LC_TIME locale: (None, None)
Current time: Sun Sep 22 11:22:24 2024

# When set to en_US.UTF-8
Current LC_TIME locale: ('en_US', 'UTF-8')
Current time: Sun 22 Sep 2024 11:22:51 AM

# When set to zh_CN.UTF-8
Current LC_TIME locale: ('zh_CN', 'UTF-8')
Current time: 2024年09月22日 星期日 10时59分10秒

# When set to C.UTF-8; the result may differ between environments
Current LC_TIME locale: ('en_US', 'UTF-8')
Current time: Sun 22 Sep 2024 11:22:51 AM
</code></pre></div></div>

<p>(1) Note: if you see the error <code class="language-plaintext highlighter-rouge">locale.Error: unsupported locale setting</code>, use the locale commands to check which locales exist and add the missing one:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># List the locales available on the system</span>
locale <span class="nt">-a</span>

<span class="c"># Install a locale</span>
<span class="nb">sudo </span>locale-gen zh_CN.UTF-8

<span class="c"># Show the current values of all LC-related environment variables</span>
locale

<span class="c"># On Debian/Ubuntu the default values of the LC variables are set in /etc/default/locale</span>
<span class="c"># After editing it, source the file or log in again for the change to take effect</span>

<span class="c"># Install or remove locales</span>
<span class="c"># Edit /etc/locale.gen directly, then run</span>
<span class="nb">sudo </span>locale-gen
</code></pre></div></div>

<p>(2) Note: when the locale is set to <code class="language-plaintext highlighter-rouge">C.UTF-8</code>, the displayed values may depend on the system's default <code class="language-plaintext highlighter-rouge">LC_*</code> variables</p>

<h1 id="little-endian-vs-big-endian">Little Endian vs Big Endian</h1>

<p>In the virtual memory seen by a C program, suppose an array of 32-bit integers stores <code class="language-plaintext highlighter-rouge">[1, 2, 3]</code> and starts at address <code class="language-plaintext highlighter-rouge">0x100</code>. Under both little endian and big endian, the 4 bytes at <code class="language-plaintext highlighter-rouge">0x100, 0x101, 0x102, 0x103</code> store 1, the 4 bytes at <code class="language-plaintext highlighter-rouge">0x104, 0x105, 0x106, 0x107</code> store 2, and the 4 bytes at <code class="language-plaintext highlighter-rouge">0x108, 0x109, 0x10A, 0x10B</code> store 3. Only the order of the bytes within each integer differs. Within a single byte, the high-order bit is always written first and the low-order bit last</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>0x100, 0x101, 0x102, 0x103, 0x104, 0x105, 0x106, 0x107, 0x108, 0x109, 0x10A, 0x10B
# Big Endian (bits of each byte, one integer per row)
00000000, 00000000, 00000000, 00000001
00000000, 00000000, 00000000, 00000010
00000000, 00000000, 00000000, 00000011
# Little Endian (bits of each byte, one integer per row)
00000001, 00000000, 00000000, 00000000
00000010, 00000000, 00000000, 00000000
00000011, 00000000, 00000000, 00000000
</code></pre></div></div>

<p>For files, the byte order does not change when a file is sent over the network or copied via a USB drive (copying and transfer have no knowledge of which bytes belong together to form meaningful content, so they can only leave the byte order untouched). How the bytes in a file are interpreted is decided by the program that writes them and the program that reads them.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">import</span> <span class="nn">struct</span>
<span class="k">with</span> <span class="nb">open</span><span class="p">(</span><span class="s">"x.dat"</span><span class="p">,</span> <span class="s">"wb"</span><span class="p">)</span> <span class="k">as</span> <span class="n">fw</span><span class="p">:</span>
    <span class="n">fw</span><span class="p">.</span><span class="n">write</span><span class="p">(</span><span class="n">struct</span><span class="p">.</span><span class="n">pack</span><span class="p">(</span><span class="s">"&lt;I"</span><span class="p">,</span> <span class="mi">1</span><span class="p">))</span>  <span class="c1"># b'\x01\x00\x00\x00'
</span>    <span class="n">fw</span><span class="p">.</span><span class="n">write</span><span class="p">(</span><span class="n">struct</span><span class="p">.</span><span class="n">pack</span><span class="p">(</span><span class="s">"&gt;I"</span><span class="p">,</span> <span class="mi">2</span><span class="p">))</span>  <span class="c1"># b'\x00\x00\x00\x02'
</span>
<span class="k">with</span> <span class="nb">open</span><span class="p">(</span><span class="s">"x.dat"</span><span class="p">,</span> <span class="s">"rb"</span><span class="p">)</span> <span class="k">as</span> <span class="n">fr</span><span class="p">:</span>
    <span class="n">x</span> <span class="o">=</span> <span class="n">fr</span><span class="p">.</span><span class="n">read</span><span class="p">(</span><span class="mi">8</span><span class="p">)</span>  <span class="c1"># b'\x01\x00\x00\x00\x00\x00\x00\x02'
</span> 
<span class="n">struct</span><span class="p">.</span><span class="n">unpack</span><span class="p">(</span><span class="s">"&lt;I"</span><span class="p">,</span> <span class="n">x</span><span class="p">[:</span><span class="mi">4</span><span class="p">])</span>  <span class="c1"># (1,)
</span><span class="n">struct</span><span class="p">.</span><span class="n">unpack</span><span class="p">(</span><span class="s">"&gt;I"</span><span class="p">,</span> <span class="n">x</span><span class="p">[</span><span class="mi">4</span><span class="p">:])</span>  <span class="c1"># (2,)
</span></code></pre></div></div>

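<p>In the example above, the writing program and the reading program have in effect agreed on the following protocol: the file contains two 32-bit integers, the first stored as little endian and the second as big endian.</p>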

<p><code class="language-plaintext highlighter-rouge">numpy.ndarray.newbyteorder</code></p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">import</span> <span class="nn">numpy</span> <span class="k">as</span> <span class="n">np</span>

<span class="c1"># Note: ndarray.newbyteorder was removed in NumPy 2.0; there, use arr.view(arr.dtype.newbyteorder('S')) instead
# 'S' swaps to the opposite byte order
</span><span class="n">np</span><span class="p">.</span><span class="n">array</span><span class="p">([</span><span class="mi">1</span><span class="p">,</span> <span class="mi">2</span><span class="p">],</span> <span class="n">dtype</span><span class="o">=</span><span class="n">np</span><span class="p">.</span><span class="n">int16</span><span class="p">).</span><span class="n">newbyteorder</span><span class="p">(</span><span class="s">'S'</span><span class="p">)</span>  <span class="c1"># array([256, 512], dtype=int16)
# '=' (native) and 'I' (ignore) keep the current byte order
</span><span class="n">np</span><span class="p">.</span><span class="n">array</span><span class="p">([</span><span class="mi">1</span><span class="p">,</span> <span class="mi">2</span><span class="p">],</span> <span class="n">dtype</span><span class="o">=</span><span class="n">np</span><span class="p">.</span><span class="n">int16</span><span class="p">).</span><span class="n">newbyteorder</span><span class="p">(</span><span class="s">'='</span><span class="p">)</span>  <span class="c1"># array([1, 2], dtype=int16)
</span><span class="n">np</span><span class="p">.</span><span class="n">array</span><span class="p">([</span><span class="mi">1</span><span class="p">,</span> <span class="mi">2</span><span class="p">],</span> <span class="n">dtype</span><span class="o">=</span><span class="n">np</span><span class="p">.</span><span class="n">int16</span><span class="p">).</span><span class="n">newbyteorder</span><span class="p">(</span><span class="s">'I'</span><span class="p">)</span>  <span class="c1"># array([1, 2], dtype=int16)
</span>
<span class="c1"># The next two outputs depend on the system; a little-endian system is assumed below
</span><span class="kn">import</span> <span class="nn">sys</span>
<span class="n">sys</span><span class="p">.</span><span class="n">byteorder</span>  <span class="c1"># 'little'
</span><span class="n">np</span><span class="p">.</span><span class="n">array</span><span class="p">([</span><span class="mi">1</span><span class="p">,</span> <span class="mi">2</span><span class="p">],</span> <span class="n">dtype</span><span class="o">=</span><span class="n">np</span><span class="p">.</span><span class="n">int16</span><span class="p">).</span><span class="n">newbyteorder</span><span class="p">(</span><span class="s">'&lt;'</span><span class="p">)</span>  <span class="c1"># array([1, 2], dtype=int16)
</span><span class="n">np</span><span class="p">.</span><span class="n">array</span><span class="p">([</span><span class="mi">1</span><span class="p">,</span> <span class="mi">2</span><span class="p">],</span> <span class="n">dtype</span><span class="o">=</span><span class="n">np</span><span class="p">.</span><span class="n">int16</span><span class="p">).</span><span class="n">newbyteorder</span><span class="p">(</span><span class="s">'&gt;'</span><span class="p">)</span>  <span class="c1"># array([256, 512], dtype=int16)
</span></code></pre></div></div>

  </div>

  <script>
    tocbot.init({
      tocSelector: '#toc',
      contentSelector: '.post-content.e-content',
      headingSelector: 'h1, h2, h3, h4, h5',
      hasInnerContainers: true,
      collapseDepth: 3
    });
  </script><a class="u-url" href="/2024/09/21/misc.html" hidden></a>
</article>

  </body>
</html>
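
Layout inheritance can be pictured as repeated wrapping: a page is rendered into its own layout, and if that layout declares a layout in its front matter, the result is wrapped again. The Python sketch below is only a rough model under simplifying assumptions (plain string replacement of {{ content }}, front matter limited to key: value lines). The names split_front_matter, render and LAYOUTS are hypothetical and are not Jekyll's implementation.

```python
def split_front_matter(text):
    """Return (metadata, body) for text that may start with a '---' block."""
    lines = text.split("\n")
    if lines[0].strip() != "---":
        return {}, text
    for i in range(1, len(lines)):
        if lines[i].strip() == "---":  # closing delimiter found
            meta = {}
            for line in lines[1:i]:
                key, sep, value = line.partition(":")
                if sep:
                    meta[key.strip()] = value.strip()
            return meta, "\n".join(lines[i + 1:])
    return {}, text  # no closing delimiter: treat everything as body


def render(body, layout_name, layouts):
    """Wrap body in layout_name, then in that layout's own layout, and so on."""
    while layout_name:
        meta, template = split_front_matter(layouts[layout_name])
        body = template.replace("{{ content }}", body)
        layout_name = meta.get("layout")
    return body


# Toy layouts: 'post' declares 'default' as its parent layout.
LAYOUTS = {
    "default": "<body>{{ content }}</body>",
    "post": "---\nlayout: default\n---\n<article>{{ content }}</article>",
}

print(render("<p>aaa</p>", "post", LAYOUTS))
# prints: <body><article><p>aaa</p></article></body>
```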

_layouts/post.html

---
layout: default
---
<h1>(LTS) Jekyll tutorial</h1>
<p>12 Dec 2021 - </p>

<article class="post h-entry" itemscope itemtype="http://schema.org/BlogPosting">

  <header class="post-header">
    <h1 class="post-title p-name" itemprop="name headline">(LTS) Misc</h1>
    <p class="post-meta">
      <time class="dt-published" datetime="2024-09-21T03:00:00+00:00" itemprop="datePublished">Sep 21, 2024
      </time></p>
  </header>

  <div class="post-content e-content" itemprop="articleBody">
    <div style="position: fixed; padding: 1em; right: 0; top: 0; width: 10%; height: 80%; overflow: auto;">
      <div id="toc"></div>
    </div>
    <h1 id="lc_-环境变量"><code class="language-plaintext highlighter-rouge">LC_*</code> 环境变量</h1>

<p>参考资料:</p>

<ul>
  <li><a href="https://sites.ualberta.ca/dept/chemeng/AIX-43/share/man/info/C/a_doc_lib/aixbman/baseadmn/locale_env.htm">https://sites.ualberta.ca/dept/chemeng/AIX-43/share/man/info/C/a_doc_lib/aixbman/baseadmn/locale_env.htm</a></li>
  <li><a href="https://www.ibm.com/docs/en/aix/7.3?topic=locales-understanding-locale-environment-variables">https://www.ibm.com/docs/en/aix/7.3?topic=locales-understanding-locale-environment-variables</a></li>
</ul>

<p><code class="language-plaintext highlighter-rouge">LC_*</code> 环境变量用于设置区域信息, 主要包括这些:</p>

<p>高优先级:</p>

<ul>
  <li><code class="language-plaintext highlighter-rouge">LC_ALL</code>: 设置此值时, 则会覆盖其余 <code class="language-plaintext highlighter-rouge">LC_*</code> 的取值</li>
  <li><code class="language-plaintext highlighter-rouge">LC_COLLATE</code>: 影响字符的排序规则</li>
  <li><code class="language-plaintext highlighter-rouge">LC_CTYPE</code>: 影响字符分类(字母,数字,符号等)以及字符集的范围以及对应的字节表示</li>
</ul>

<p>中优先级:</p>

<ul>
  <li><code class="language-plaintext highlighter-rouge">LC_MESSAGES</code>: 控制程序显示的提示和错误信息的语言</li>
  <li><code class="language-plaintext highlighter-rouge">LC_MONETARY</code>: 控制货币的符号以及货币符号的位置</li>
  <li><code class="language-plaintext highlighter-rouge">LC_NUMERIC</code>: 控制数字的输出格式(例如每三位用逗号隔开)</li>
  <li><code class="language-plaintext highlighter-rouge">LC_TIME</code>: 控制日期显示格式</li>
</ul>

<p>低优先级:</p>

<ul>
  <li><code class="language-plaintext highlighter-rouge">LANG</code>: 当 <code class="language-plaintext highlighter-rouge">LC_ALL</code> 未被设置时, 且 <code class="language-plaintext highlighter-rouge">LC_*</code> 变量未设置时, 那么 <code class="language-plaintext highlighter-rouge">LC_*</code> 变量将使用 <code class="language-plaintext highlighter-rouge">LANG</code> 的取值</li>
</ul>

<p>以上这些变量的常见取值有</p>

<ul>
  <li><code class="language-plaintext highlighter-rouge">C</code>, <code class="language-plaintext highlighter-rouge">POSIX</code>: 这两者完全等价, 字符集仅包括 ASCII, 是最简单, 与区域无关的默认值, 设置此值时在 C 语言环境下可保证一致行为, 也是操作系统启动时的默认值</li>
  <li><code class="language-plaintext highlighter-rouge">C.UTF-8</code>: 对 <code class="language-plaintext highlighter-rouge">C</code> 的扩展, 主要是扩展字符集为 UTF-8, 但与区域无关. 当希望在 C 语言环境下保证一致行为, 且希望采用 UTF-8 字符集时, 推荐采用此值.</li>
  <li><code class="language-plaintext highlighter-rouge">en_US.utf8</code>: 字符集为 UTF-8, 日期、时间、货币和其他格式符合美国习惯</li>
  <li><code class="language-plaintext highlighter-rouge">zh_CN.utf8</code>: 字符集为 UTF-8, 日期、时间、货币和其他格式符合中国习惯</li>
</ul>

<p>上述优先级的设定可以用下面的 python 代码示意:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">keys</span> <span class="o">=</span> <span class="p">[</span><span class="s">"LC_COLLATE"</span><span class="p">,</span> <span class="s">"LC_CTYPE"</span><span class="p">,</span> <span class="s">"LC_MESSAGES"</span><span class="p">,</span> <span class="s">"LC_MONETARY"</span><span class="p">,</span> <span class="s">"LC_NUMERIC"</span><span class="p">,</span> <span class="s">"LC_TIME"</span><span class="p">]</span>
<span class="n">lc_vars</span> <span class="o">=</span> <span class="p">{</span><span class="n">key</span><span class="p">:</span> <span class="s">"C"</span> <span class="k">for</span> <span class="n">key</span> <span class="ow">in</span> <span class="n">keys</span><span class="p">}</span>  <span class="c1"># C 是默认值
</span><span class="k">for</span> <span class="n">key</span> <span class="ow">in</span> <span class="n">keys</span><span class="p">:</span>
    <span class="k">if</span> <span class="n">os</span><span class="p">.</span><span class="n">environ</span><span class="p">.</span><span class="n">get</span><span class="p">(</span><span class="s">"LC_ALL"</span><span class="p">,</span> <span class="s">""</span><span class="p">):</span>
        <span class="n">lc_vars</span><span class="p">[</span><span class="n">key</span><span class="p">]</span> <span class="o">=</span> <span class="n">os</span><span class="p">.</span><span class="n">environ</span><span class="p">.</span><span class="n">get</span><span class="p">(</span><span class="s">"LC_ALL"</span><span class="p">,</span> <span class="s">""</span><span class="p">)</span>
    <span class="k">elif</span> <span class="n">os</span><span class="p">.</span><span class="n">environ</span><span class="p">.</span><span class="n">get</span><span class="p">(</span><span class="n">key</span><span class="p">,</span> <span class="s">""</span><span class="p">):</span>
        <span class="n">lc_vars</span><span class="p">[</span><span class="n">key</span><span class="p">]</span> <span class="o">=</span> <span class="n">os</span><span class="p">.</span><span class="n">environ</span><span class="p">.</span><span class="n">get</span><span class="p">(</span><span class="n">key</span><span class="p">,</span> <span class="s">""</span><span class="p">)</span>
    <span class="k">elif</span> <span class="n">os</span><span class="p">.</span><span class="n">environ</span><span class="p">.</span><span class="n">get</span><span class="p">(</span><span class="s">"LANG"</span><span class="p">,</span> <span class="s">""</span><span class="p">):</span>
        <span class="n">lc_vars</span><span class="p">[</span><span class="n">key</span><span class="p">]</span> <span class="o">=</span> <span class="n">os</span><span class="p">.</span><span class="n">environ</span><span class="p">.</span><span class="n">get</span><span class="p">(</span><span class="s">"LANG"</span><span class="p">,</span> <span class="s">""</span><span class="p">)</span>
</code></pre></div></div>

<p>除了上述变量以外, 使用 <code class="language-plaintext highlighter-rouge">locale</code> 命令, 还会看到这些变量:</p>

<ul>
  <li><code class="language-plaintext highlighter-rouge">LANGUAGE</code> 主要用于翻译设置的优先级, 例如: <code class="language-plaintext highlighter-rouge">LANGUAGE=fr:en:de</code>, 表明翻译优先级由高到低为: 法语,英语,德语</li>
  <li><code class="language-plaintext highlighter-rouge">LC_PAPER</code>: 纸张大小</li>
  <li><code class="language-plaintext highlighter-rouge">LC_NAME</code>: 人名的书写格式</li>
  <li><code class="language-plaintext highlighter-rouge">LC_ADDRESS</code>: 地址的书写格式</li>
  <li><code class="language-plaintext highlighter-rouge">LC_TELEPHONE</code>: 电话号码格式</li>
  <li><code class="language-plaintext highlighter-rouge">LC_MEASUREMENT</code>: 度量衡</li>
  <li><code class="language-plaintext highlighter-rouge">LC_IDENTIFICATION</code>: 特定标识?</li>
</ul>

<p><strong>所以, 比较省事的做法是直接设置 <code class="language-plaintext highlighter-rouge">LC_ALL</code> 变量. 最后, 以上所有的变量以及优先级仅对严格遵循 POSIX 标准的应用程序有效. 并且这些变量通常是在“软件本地化”(国际化:i18n 和本地化:l10n)的场景下才会用到</strong></p>

<p>以下是一个关于 <code class="language-plaintext highlighter-rouge">LC_TIME</code> 的示例</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">import</span> <span class="nn">locale</span>
<span class="kn">from</span> <span class="nn">datetime</span> <span class="kn">import</span> <span class="n">datetime</span>

<span class="c1"># 可以尝试设置为: C, C.UTF-8, en_US.UTF-8, zh_CN.UTF-8
</span><span class="n">locale</span><span class="p">.</span><span class="n">setlocale</span><span class="p">(</span><span class="n">locale</span><span class="p">.</span><span class="n">LC_TIME</span><span class="p">,</span> <span class="s">'C'</span><span class="p">)</span>

<span class="c1"># 获取并打印当前系统的日期格式
</span><span class="n">current_locale</span> <span class="o">=</span> <span class="n">locale</span><span class="p">.</span><span class="n">getlocale</span><span class="p">(</span><span class="n">locale</span><span class="p">.</span><span class="n">LC_TIME</span><span class="p">)</span>
<span class="k">print</span><span class="p">(</span><span class="s">"当前 LC_TIME 本地化设置:"</span><span class="p">,</span> <span class="n">current_locale</span><span class="p">)</span>

<span class="c1"># 打印当前时间, 注意: 这里用的格式化方式是 %c
</span><span class="n">now</span> <span class="o">=</span> <span class="n">datetime</span><span class="p">.</span><span class="n">now</span><span class="p">()</span>
<span class="k">print</span><span class="p">(</span><span class="s">"当前时间:"</span><span class="p">,</span> <span class="n">now</span><span class="p">.</span><span class="n">strftime</span><span class="p">(</span><span class="s">'%c'</span><span class="p">))</span>
</code></pre></div></div>

<p>输出:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code># 设置为 C 或 POSIX 时
当前 LC_TIME 本地化设置: (None, None)
当前时间: Sun Sep 22 11:22:24 2024

# 设置为 en_US.UTF-8 时
当前 LC_TIME 本地化设置: ('en_US', 'UTF-8')
当前时间: Sun 22 Sep 2024 11:22:51 AM

# 设置为 zh_CN.UTF-8 时
当前 LC_TIME 本地化设置: ('zh_CN', 'UTF-8')
当前时间: 2024年09月22日 星期日 10时59分10秒

# 设置为 C.UTF-8 时, 不同的环境配置可能有所不同
当前 LC_TIME 本地化设置: ('en_US', 'UTF-8')
当前时间: Sun 22 Sep 2024 11:22:51 AM
</code></pre></div></div>

<p>(1) 注意: 如果出现 <code class="language-plaintext highlighter-rouge">locale.Error: unsupported locale setting</code> 这种报错, 可以使用 locale 命令进行检查和添加:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># 查看系统中支持的语言环境</span>
locale <span class="nt">-a</span>

<span class="c"># 安装语言环境</span>
<span class="nb">sudo </span>locale-gen zh_CN.UTF-8

<span class="c"># 查看所有LC相关的环境变量的设置情况</span>
locale

<span class="c"># Debian/Ubuntu 系统 LC 相关变量的默认值设置文件: /etc/default/locale</span>
<span class="c"># 修改后可以 source 或重新登录使其生效</span>

<span class="c"># 安装/删除语言环境</span>
<span class="c"># 可以直接修改 /etc/locale.gen 文件, 然后执行</span>
<span class="nb">sudo </span>locale-gen
</code></pre></div></div>

<p>(2) 注意: 当设置为 <code class="language-plaintext highlighter-rouge">C.UTF-8</code> 时, 显示的值可能会与系统默认的 <code class="language-plaintext highlighter-rouge">LC_*</code> 变量有关</p>

<h1 id="little-endian-vs-big-endian">Little Endian vs Big Endian</h1>

<p>在 C 语言的虚拟内存中, 假设有一个 32 位的整数数组, 用来存储: <code class="language-plaintext highlighter-rouge">[1, 2, 3]</code>, 假设数组的起始地址是 <code class="language-plaintext highlighter-rouge">0x100</code>, 那么无论是 Little Endian 还是 Big Endian, <code class="language-plaintext highlighter-rouge">0x100, 0x101, 0x102, 0x103</code> 这 4 个字节用于存储 1, <code class="language-plaintext highlighter-rouge">0x104, 0x105, 0x106, 0x107</code> 这 4 个字节用于存储 2, <code class="language-plaintext highlighter-rouge">0x108, 0x109, 0x10A, 0x10B</code> 这 4 个字节用于存储 3, 另外在一个字节内部, 总是高位在前, 低位在后</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>0x100, 0x101, 0x102, 0x103, 0x104, 0x105, 0x106, 0x107, 0x108, 0x109, 0x10A, 0x10B
# Big Endian (bit values of each byte)
00000000, 00000000, 00000000, 00000001
00000000, 00000000, 00000000, 00000010
00000000, 00000000, 00000000, 00000011
# Little Endian (bit values of each byte)
00000001, 00000000, 00000000, 00000000
00000010, 00000000, 00000000, 00000000
00000011, 00000000, 00000000, 00000000
</code></pre></div></div>
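<p>The layout above can be reproduced with Python's built-in <code class="language-plaintext highlighter-rouge">int.to_bytes</code>. This is only an illustrative sketch: the helper name <code class="language-plaintext highlighter-rouge">layout</code> is made up for this example, and the bytes are printed in hexadecimal rather than as bits.</p>

```python
values = [1, 2, 3]


def layout(values, byteorder):
    """Hypothetical helper: lay out each value as a 4-byte unsigned integer."""
    return b"".join(v.to_bytes(4, byteorder) for v in values)


big = layout(values, "big")
little = layout(values, "little")

# Each group of four bytes holds one integer; only the order inside a group differs.
print(big.hex(" "))     # 00 00 00 01 00 00 00 02 00 00 00 03
print(little.hex(" "))  # 01 00 00 00 02 00 00 00 03 00 00 00
```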

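<p>For files, the byte order does not change when a file is sent over the network or copied to a USB drive. The copy or transfer does not know which bytes belong together to form a meaningful value, so it can only pass the bytes along unchanged. How the bytes in a file are interpreted is decided by the programs that write and read it.</p>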

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">import</span> <span class="nn">struct</span>
<span class="k">with</span> <span class="nb">open</span><span class="p">(</span><span class="s">"x.dat"</span><span class="p">,</span> <span class="s">"wb"</span><span class="p">)</span> <span class="k">as</span> <span class="n">fw</span><span class="p">:</span>
    <span class="n">fw</span><span class="p">.</span><span class="n">write</span><span class="p">(</span><span class="n">struct</span><span class="p">.</span><span class="n">pack</span><span class="p">(</span><span class="s">"&lt;I"</span><span class="p">,</span> <span class="mi">1</span><span class="p">))</span>  <span class="c1"># b'\x01\x00\x00\x00'
</span>    <span class="n">fw</span><span class="p">.</span><span class="n">write</span><span class="p">(</span><span class="n">struct</span><span class="p">.</span><span class="n">pack</span><span class="p">(</span><span class="s">"&gt;I"</span><span class="p">,</span> <span class="mi">2</span><span class="p">))</span>  <span class="c1"># b'\x00\x00\x00\x02'
</span>
<span class="k">with</span> <span class="nb">open</span><span class="p">(</span><span class="s">"x.dat"</span><span class="p">,</span> <span class="s">"rb"</span><span class="p">)</span> <span class="k">as</span> <span class="n">fr</span><span class="p">:</span>
    <span class="n">x</span> <span class="o">=</span> <span class="n">fr</span><span class="p">.</span><span class="n">read</span><span class="p">(</span><span class="mi">8</span><span class="p">)</span>  <span class="c1"># b'\x01\x00\x00\x00\x00\x00\x00\x02'
</span> 
<span class="n">struct</span><span class="p">.</span><span class="n">unpack</span><span class="p">(</span><span class="s">"&lt;I"</span><span class="p">,</span> <span class="n">x</span><span class="p">[:</span><span class="mi">4</span><span class="p">])</span>  <span class="c1"># (1,)
</span><span class="n">struct</span><span class="p">.</span><span class="n">unpack</span><span class="p">(</span><span class="s">"&gt;I"</span><span class="p">,</span> <span class="n">x</span><span class="p">[</span><span class="mi">4</span><span class="p">:])</span>  <span class="c1"># (2,)
</span></code></pre></div></div>

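<p>In the example above, the writing program and the reading program have in effect agreed on the following protocol: the file contains two 32-bit unsigned integers, the first stored as Little Endian and the second as Big Endian.</p>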
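<p>To see why such an agreement matters, the sketch below reads the same four bytes under both conventions. It uses only the standard library; the variable names are illustrative.</p>

```python
import struct

# Four bytes as a writer using little-endian order would produce them for the value 1.
raw = struct.pack("<I", 1)  # b'\x01\x00\x00\x00'

# A reader that follows the agreed convention recovers the original value ...
as_little = int.from_bytes(raw, byteorder="little")
# ... while a reader that assumes the opposite convention gets a different number.
as_big = int.from_bytes(raw, byteorder="big")

print(as_little)  # 1
print(as_big)     # 16777216, i.e. 0x01000000
```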

<p><code class="language-plaintext highlighter-rouge">numpy.ndarray.newbyteorder</code></p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">import</span> <span class="nn">numpy</span> <span class="k">as</span> <span class="n">np</span>

<span class="c1"># Note: ndarray.newbyteorder was removed in NumPy 2.0; there, use arr.view(arr.dtype.newbyteorder(order)) instead
# 'S' swaps to the opposite byte order
</span><span class="n">np</span><span class="p">.</span><span class="n">array</span><span class="p">([</span><span class="mi">1</span><span class="p">,</span> <span class="mi">2</span><span class="p">],</span> <span class="n">dtype</span><span class="o">=</span><span class="n">np</span><span class="p">.</span><span class="n">int16</span><span class="p">).</span><span class="n">newbyteorder</span><span class="p">(</span><span class="s">'S'</span><span class="p">)</span>  <span class="c1"># array([256, 512], dtype=int16)
# '=' and 'I' keep the current byte order
</span><span class="n">np</span><span class="p">.</span><span class="n">array</span><span class="p">([</span><span class="mi">1</span><span class="p">,</span> <span class="mi">2</span><span class="p">],</span> <span class="n">dtype</span><span class="o">=</span><span class="n">np</span><span class="p">.</span><span class="n">int16</span><span class="p">).</span><span class="n">newbyteorder</span><span class="p">(</span><span class="s">'='</span><span class="p">)</span>  <span class="c1"># array([1, 2], dtype=int16)
</span><span class="n">np</span><span class="p">.</span><span class="n">array</span><span class="p">([</span><span class="mi">1</span><span class="p">,</span> <span class="mi">2</span><span class="p">],</span> <span class="n">dtype</span><span class="o">=</span><span class="n">np</span><span class="p">.</span><span class="n">int16</span><span class="p">).</span><span class="n">newbyteorder</span><span class="p">(</span><span class="s">'I'</span><span class="p">)</span>  <span class="c1"># array([1, 2], dtype=int16)
</span>
<span class="c1"># The output of the next two calls depends on the platform; a Little Endian system is assumed here
</span><span class="kn">import</span> <span class="nn">sys</span>
<span class="n">sys</span><span class="p">.</span><span class="n">byteorder</span>  <span class="c1"># 'little'
</span><span class="n">np</span><span class="p">.</span><span class="n">array</span><span class="p">([</span><span class="mi">1</span><span class="p">,</span> <span class="mi">2</span><span class="p">],</span> <span class="n">dtype</span><span class="o">=</span><span class="n">np</span><span class="p">.</span><span class="n">int16</span><span class="p">).</span><span class="n">newbyteorder</span><span class="p">(</span><span class="s">'&lt;'</span><span class="p">)</span>  <span class="c1"># array([1, 2], dtype=int16)
</span><span class="n">np</span><span class="p">.</span><span class="n">array</span><span class="p">([</span><span class="mi">1</span><span class="p">,</span> <span class="mi">2</span><span class="p">],</span> <span class="n">dtype</span><span class="o">=</span><span class="n">np</span><span class="p">.</span><span class="n">int16</span><span class="p">).</span><span class="n">newbyteorder</span><span class="p">(</span><span class="s">'&gt;'</span><span class="p">)</span>  <span class="c1"># array([256, 512], dtype=int16)
</span></code></pre></div></div>
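<p>The values 256 and 512 can be checked without NumPy. The following sketch packs the numbers as little-endian 16-bit integers and rereads the same bytes as big-endian; the helper name <code class="language-plaintext highlighter-rouge">swap_int16</code> is hypothetical and only meant to mirror what a byte-order swap does to the interpretation of the data.</p>

```python
import struct


def swap_int16(values):
    """Hypothetical helper: reinterpret little-endian int16 bytes as big-endian."""
    count = len(values)
    raw = struct.pack(f"<{count}h", *values)       # 1 -> 01 00, 2 -> 02 00
    return list(struct.unpack(f">{count}h", raw))  # 01 00 -> 0x0100, 02 00 -> 0x0200


print(swap_int16([1, 2]))  # [256, 512]
```

<p>Applying the swap twice returns the original values, since the underlying bytes never change and only their interpretation does.</p>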

  </div>

</article>

_posts/2018-08-20-bananas.md

---
layout: post
author: jill
---
first paragraph

second paragraph

The generated file _site/2018/08/20/bananas.html (assuming Jekyll's default _site output directory) has the following content:

<!doctype html>
<html>
  <head>
    <meta charset="utf-8">
    <title>Bananas</title>
  </head>
  <body>
    <h1>Bananas</h1>
    <p>20 Aug 2018 - jill</p>
    <p>first paragraph</p>
    <p>second paragraph</p>
  </body>
</html>

Explanation: because of layout inheritance, _layouts/post.html is effectively expanded to:

<!doctype html>
<html>
  <head>
    <meta charset="utf-8">
    <title>(LTS) Jekyll tutorial</title>
  </head>
  <body>
    <!-- Because of layout inheritance, the post's actual content replaces the content placeholder of default.html here -->
    <h1>(LTS) Jekyll tutorial</h1>
	<p>12 Dec 2021 - </p>
	<article class="post h-entry" itemscope itemtype="http://schema.org/BlogPosting">

  <header class="post-header">
    <h1 class="post-title p-name" itemprop="name headline">(LTS) Misc</h1>
    <p class="post-meta">
      <time class="dt-published" datetime="2024-09-21T03:00:00+00:00" itemprop="datePublished">Sep 21, 2024
      </time></p>
  </header>

  <div class="post-content e-content" itemprop="articleBody">
    <h1 id="lc_-环境变量"><code class="language-plaintext highlighter-rouge">LC_*</code> environment variables</h1>

<p>References:</p>

<ul>
  <li><a href="https://sites.ualberta.ca/dept/chemeng/AIX-43/share/man/info/C/a_doc_lib/aixbman/baseadmn/locale_env.htm">https://sites.ualberta.ca/dept/chemeng/AIX-43/share/man/info/C/a_doc_lib/aixbman/baseadmn/locale_env.htm</a></li>
  <li><a href="https://www.ibm.com/docs/en/aix/7.3?topic=locales-understanding-locale-environment-variables">https://www.ibm.com/docs/en/aix/7.3?topic=locales-understanding-locale-environment-variables</a></li>
</ul>

<p>The <code class="language-plaintext highlighter-rouge">LC_*</code> environment variables configure locale information. The main ones are:</p>

<p>High priority:</p>

<ul>
  <li><code class="language-plaintext highlighter-rouge">LC_ALL</code>: when set, it overrides the values of all other <code class="language-plaintext highlighter-rouge">LC_*</code> variables</li>
  <li><code class="language-plaintext highlighter-rouge">LC_COLLATE</code>: affects the collation (sort) order of characters</li>
  <li><code class="language-plaintext highlighter-rouge">LC_CTYPE</code>: affects character classification (letters, digits, symbols, and so on), the range of the character set, and its byte representation</li>
</ul>

<p>Medium priority:</p>

<ul>
  <li><code class="language-plaintext highlighter-rouge">LC_MESSAGES</code>: controls the language of the prompts and error messages a program displays</li>
  <li><code class="language-plaintext highlighter-rouge">LC_MONETARY</code>: controls the currency symbol and where it is placed</li>
  <li><code class="language-plaintext highlighter-rouge">LC_NUMERIC</code>: controls how numbers are formatted (for example, a comma every three digits)</li>
  <li><code class="language-plaintext highlighter-rouge">LC_TIME</code>: controls how dates and times are displayed</li>
</ul>

<p>Low priority:</p>

<ul>
  <li><code class="language-plaintext highlighter-rouge">LANG</code>: when <code class="language-plaintext highlighter-rouge">LC_ALL</code> is not set and a given <code class="language-plaintext highlighter-rouge">LC_*</code> variable is not set either, that <code class="language-plaintext highlighter-rouge">LC_*</code> variable takes the value of <code class="language-plaintext highlighter-rouge">LANG</code></li>
</ul>

<p>Common values for these variables are:</p>

<ul>
  <li><code class="language-plaintext highlighter-rouge">C</code>, <code class="language-plaintext highlighter-rouge">POSIX</code>: the two are fully equivalent. The character set covers ASCII only. This is the simplest, region-independent default; it guarantees consistent behavior in C programs and is also the default when the operating system starts</li>
  <li><code class="language-plaintext highlighter-rouge">C.UTF-8</code>: an extension of <code class="language-plaintext highlighter-rouge">C</code> that mainly widens the character set to UTF-8 while staying region-independent. It is the recommended value when you want consistent behavior in C programs together with the UTF-8 character set.</li>
  <li><code class="language-plaintext highlighter-rouge">en_US.utf8</code>: UTF-8 character set; date, time, currency, and other formats follow United States conventions</li>
  <li><code class="language-plaintext highlighter-rouge">zh_CN.utf8</code>: UTF-8 character set; date, time, currency, and other formats follow Chinese conventions</li>
</ul>

<p>The priority rules above can be sketched in Python as follows:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">import</span> <span class="nn">os</span>

<span class="n">keys</span> <span class="o">=</span> <span class="p">[</span><span class="s">"LC_COLLATE"</span><span class="p">,</span> <span class="s">"LC_CTYPE"</span><span class="p">,</span> <span class="s">"LC_MESSAGES"</span><span class="p">,</span> <span class="s">"LC_MONETARY"</span><span class="p">,</span> <span class="s">"LC_NUMERIC"</span><span class="p">,</span> <span class="s">"LC_TIME"</span><span class="p">]</span>
<span class="n">lc_vars</span> <span class="o">=</span> <span class="p">{</span><span class="n">key</span><span class="p">:</span> <span class="s">"C"</span> <span class="k">for</span> <span class="n">key</span> <span class="ow">in</span> <span class="n">keys</span><span class="p">}</span>  <span class="c1"># "C" is the default value
</span><span class="k">for</span> <span class="n">key</span> <span class="ow">in</span> <span class="n">keys</span><span class="p">:</span>
    <span class="k">if</span> <span class="n">os</span><span class="p">.</span><span class="n">environ</span><span class="p">.</span><span class="n">get</span><span class="p">(</span><span class="s">"LC_ALL"</span><span class="p">,</span> <span class="s">""</span><span class="p">):</span>
        <span class="n">lc_vars</span><span class="p">[</span><span class="n">key</span><span class="p">]</span> <span class="o">=</span> <span class="n">os</span><span class="p">.</span><span class="n">environ</span><span class="p">.</span><span class="n">get</span><span class="p">(</span><span class="s">"LC_ALL"</span><span class="p">,</span> <span class="s">""</span><span class="p">)</span>
    <span class="k">elif</span> <span class="n">os</span><span class="p">.</span><span class="n">environ</span><span class="p">.</span><span class="n">get</span><span class="p">(</span><span class="n">key</span><span class="p">,</span> <span class="s">""</span><span class="p">):</span>
        <span class="n">lc_vars</span><span class="p">[</span><span class="n">key</span><span class="p">]</span> <span class="o">=</span> <span class="n">os</span><span class="p">.</span><span class="n">environ</span><span class="p">.</span><span class="n">get</span><span class="p">(</span><span class="n">key</span><span class="p">,</span> <span class="s">""</span><span class="p">)</span>
    <span class="k">elif</span> <span class="n">os</span><span class="p">.</span><span class="n">environ</span><span class="p">.</span><span class="n">get</span><span class="p">(</span><span class="s">"LANG"</span><span class="p">,</span> <span class="s">""</span><span class="p">):</span>
        <span class="n">lc_vars</span><span class="p">[</span><span class="n">key</span><span class="p">]</span> <span class="o">=</span> <span class="n">os</span><span class="p">.</span><span class="n">environ</span><span class="p">.</span><span class="n">get</span><span class="p">(</span><span class="s">"LANG"</span><span class="p">,</span> <span class="s">""</span><span class="p">)</span>
</code></pre></div></div>
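<p>The same lookup order can be written as a small function that takes the environment as an argument, which makes it easy to try out different combinations. This is a sketch of the rule described above, not of any particular libc implementation; the function name <code class="language-plaintext highlighter-rouge">resolve_locale</code> and the sample environments are hypothetical.</p>

```python
def resolve_locale(category, env):
    """Hypothetical helper: pick the effective value for one LC_* category.

    Lookup order: LC_ALL, then the category itself, then LANG, then "C".
    Empty strings are treated the same as unset variables.
    """
    for name in ("LC_ALL", category, "LANG"):
        value = env.get(name, "")
        if value:
            return value
    return "C"


# Sample environments (made up for illustration).
print(resolve_locale("LC_TIME", {"LANG": "en_US.UTF-8"}))
print(resolve_locale("LC_TIME", {"LANG": "en_US.UTF-8", "LC_TIME": "zh_CN.UTF-8"}))
print(resolve_locale("LC_TIME", {"LC_ALL": "C.UTF-8", "LC_TIME": "zh_CN.UTF-8"}))
```

<p>Passing <code class="language-plaintext highlighter-rouge">os.environ</code> as <code class="language-plaintext highlighter-rouge">env</code> would apply the same rule to the real process environment.</p>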

<p>Besides the variables above, the <code class="language-plaintext highlighter-rouge">locale</code> command also reports these:</p>

<ul>
  <li><code class="language-plaintext highlighter-rouge">LANGUAGE</code>: mainly sets the priority of translations. For example, <code class="language-plaintext highlighter-rouge">LANGUAGE=fr:en:de</code> means translations are tried in the order French, English, German</li>
  <li><code class="language-plaintext highlighter-rouge">LC_PAPER</code>: paper size</li>
  <li><code class="language-plaintext highlighter-rouge">LC_NAME</code>: format for writing personal names</li>
  <li><code class="language-plaintext highlighter-rouge">LC_ADDRESS</code>: format for writing addresses</li>
  <li><code class="language-plaintext highlighter-rouge">LC_TELEPHONE</code>: telephone number format</li>
  <li><code class="language-plaintext highlighter-rouge">LC_MEASUREMENT</code>: units of measurement</li>
  <li><code class="language-plaintext highlighter-rouge">LC_IDENTIFICATION</code>: metadata about the locale itself</li>
</ul>

<p><strong>So the simplest approach is to set <code class="language-plaintext highlighter-rouge">LC_ALL</code> directly. Finally, all the variables and priorities above apply only to applications that strictly follow the POSIX standard, and they typically matter only for software localization (internationalization, i18n, and localization, l10n).</strong></p>

<p>Here is an example involving <code class="language-plaintext highlighter-rouge">LC_TIME</code>:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">import</span> <span class="nn">locale</span>
<span class="kn">from</span> <span class="nn">datetime</span> <span class="kn">import</span> <span class="n">datetime</span>

<span class="c1"># Try any of: C, C.UTF-8, en_US.UTF-8, zh_CN.UTF-8
</span><span class="n">locale</span><span class="p">.</span><span class="n">setlocale</span><span class="p">(</span><span class="n">locale</span><span class="p">.</span><span class="n">LC_TIME</span><span class="p">,</span> <span class="s">'C'</span><span class="p">)</span>

<span class="c1"># Query and print the locale currently in effect for LC_TIME
</span><span class="n">current_locale</span> <span class="o">=</span> <span class="n">locale</span><span class="p">.</span><span class="n">getlocale</span><span class="p">(</span><span class="n">locale</span><span class="p">.</span><span class="n">LC_TIME</span><span class="p">)</span>
<span class="k">print</span><span class="p">(</span><span class="s">"Current LC_TIME locale:"</span><span class="p">,</span> <span class="n">current_locale</span><span class="p">)</span>

<span class="c1"># Print the current time; note that the %c format is locale-dependent
</span><span class="n">now</span> <span class="o">=</span> <span class="n">datetime</span><span class="p">.</span><span class="n">now</span><span class="p">()</span>
<span class="k">print</span><span class="p">(</span><span class="s">"Current time:"</span><span class="p">,</span> <span class="n">now</span><span class="p">.</span><span class="n">strftime</span><span class="p">(</span><span class="s">'%c'</span><span class="p">))</span>
</code></pre></div></div>

<p>Output:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code># With C or POSIX
Current LC_TIME locale: (None, None)
Current time: Sun Sep 22 11:22:24 2024

# With en_US.UTF-8
Current LC_TIME locale: ('en_US', 'UTF-8')
Current time: Sun 22 Sep 2024 11:22:51 AM

# With zh_CN.UTF-8
Current LC_TIME locale: ('zh_CN', 'UTF-8')
Current time: 2024年09月22日 星期日 10时59分10秒

# With C.UTF-8; the result may differ between environments
Current LC_TIME locale: ('en_US', 'UTF-8')
Current time: Sun 22 Sep 2024 11:22:51 AM
</code></pre></div></div>

<p>Note (1): if you hit the error <code class="language-plaintext highlighter-rouge">locale.Error: unsupported locale setting</code>, use the locale command to check which locales are available and to add the missing one:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># List the locales available on the system</span>
locale <span class="nt">-a</span>

<span class="c"># Generate a locale</span>
<span class="nb">sudo </span>locale-gen zh_CN.UTF-8

<span class="c"># Show the current values of all LC-related environment variables</span>
locale

<span class="c"># On Debian/Ubuntu, the default values of the LC variables live in /etc/default/locale</span>
<span class="c"># After editing it, source the file or log in again for the change to take effect</span>

<span class="c"># To add or remove locales,</span>
<span class="c"># edit /etc/locale.gen directly and then run</span>
<span class="nb">sudo </span>locale-gen
</code></pre></div></div>
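
<p>The same check can be sketched from Python: try a list of candidate locale names and keep the first one that <code class="language-plaintext highlighter-rouge">locale.setlocale</code> accepts. The helper name <code class="language-plaintext highlighter-rouge">first_supported_locale</code> and the candidate list are assumptions made for illustration, not part of the original notes.</p>

```python
import locale


def first_supported_locale(candidates, category=locale.LC_TIME):
    """Return the first name in `candidates` that setlocale accepts, else None.

    Hypothetical helper: the previous setting is restored before returning.
    """
    saved = locale.setlocale(category)  # query only, does not change anything
    try:
        for name in candidates:
            try:
                locale.setlocale(category, name)
                return name
            except locale.Error:
                # Raised as "unsupported locale setting" when the locale is missing
                continue
        return None
    finally:
        locale.setlocale(category, saved)


# "C" is always available, so it acts as the final fallback.
chosen = first_supported_locale(["zh_CN.UTF-8", "en_US.UTF-8", "C"])
print("usable locale:", chosen)
```

<p>If the preferred name is not returned, the locale has probably not been generated yet, which is the situation the commands above address.</p>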

<p>Note (2): with <code class="language-plaintext highlighter-rouge">C.UTF-8</code>, the displayed values may depend on the system's default <code class="language-plaintext highlighter-rouge">LC_*</code> variables.</p>

<h1 id="little-endian-vs-big-endian">Little Endian vs Big Endian</h1>

<p>In the virtual memory of a C program, suppose an array of 32-bit integers stores <code class="language-plaintext highlighter-rouge">[1, 2, 3]</code> and starts at address <code class="language-plaintext highlighter-rouge">0x100</code>. Regardless of whether the machine is Little Endian or Big Endian, the 4 bytes <code class="language-plaintext highlighter-rouge">0x100, 0x101, 0x102, 0x103</code> store 1, the 4 bytes <code class="language-plaintext highlighter-rouge">0x104, 0x105, 0x106, 0x107</code> store 2, and the 4 bytes <code class="language-plaintext highlighter-rouge">0x108, 0x109, 0x10A, 0x10B</code> store 3. Byte order only decides how the 4 bytes of one integer are arranged; within a single byte, the bits are always written most significant first.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>0x100, 0x101, 0x102, 0x103, 0x104, 0x105, 0x106, 0x107, 0x108, 0x109, 0x10A, 0x10B
# Big Endian (bit values of each byte)
00000000, 00000000, 00000000, 00000001
00000000, 00000000, 00000000, 00000010
00000000, 00000000, 00000000, 00000011
# Little Endian (bit values of each byte)
00000001, 00000000, 00000000, 00000000
00000010, 00000000, 00000000, 00000000
00000011, 00000000, 00000000, 00000000
</code></pre></div></div>
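
<p>The layout above can be reproduced with a short sketch built on <code class="language-plaintext highlighter-rouge">int.to_bytes</code>. The helper name <code class="language-plaintext highlighter-rouge">layout</code> and the base address <code class="language-plaintext highlighter-rouge">0x100</code> are only illustrative; the script prints byte values in hexadecimal rather than as bits.</p>

```python
import sys

values = [1, 2, 3]


def layout(values, byteorder, width=4):
    """Return the bytes holding `values` as consecutive `width`-byte unsigned integers."""
    return b"".join(v.to_bytes(width, byteorder) for v in values)


big = layout(values, "big")
little = layout(values, "little")

# Each integer occupies the same 4 addresses in both layouts;
# only the order of the bytes inside those 4 addresses differs.
for offset in range(0, len(big), 4):
    addr = 0x100 + offset
    print(f"{addr:#x}: big={big[offset:offset + 4].hex(' ')}  "
          f"little={little[offset:offset + 4].hex(' ')}")

print("native byte order of this machine:", sys.byteorder)
```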

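<p>For files, the byte order does not change when a file is sent over the network or copied to a USB drive: the copy or transfer has no idea which bytes belong together to form a meaningful value, so it can only move them verbatim. How the bytes in a file are interpreted is decided by the program that writes the file and the program that reads it.</p>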

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">import</span> <span class="nn">struct</span>
<span class="k">with</span> <span class="nb">open</span><span class="p">(</span><span class="s">"x.dat"</span><span class="p">,</span> <span class="s">"wb"</span><span class="p">)</span> <span class="k">as</span> <span class="n">fw</span><span class="p">:</span>
    <span class="n">fw</span><span class="p">.</span><span class="n">write</span><span class="p">(</span><span class="n">struct</span><span class="p">.</span><span class="n">pack</span><span class="p">(</span><span class="s">"&lt;I"</span><span class="p">,</span> <span class="mi">1</span><span class="p">))</span>  <span class="c1"># b'\x01\x00\x00\x00'
</span>    <span class="n">fw</span><span class="p">.</span><span class="n">write</span><span class="p">(</span><span class="n">struct</span><span class="p">.</span><span class="n">pack</span><span class="p">(</span><span class="s">"&gt;I"</span><span class="p">,</span> <span class="mi">2</span><span class="p">))</span>  <span class="c1"># b'\x00\x00\x00\x02'
</span>
<span class="k">with</span> <span class="nb">open</span><span class="p">(</span><span class="s">"x.dat"</span><span class="p">,</span> <span class="s">"rb"</span><span class="p">)</span> <span class="k">as</span> <span class="n">fr</span><span class="p">:</span>
    <span class="n">x</span> <span class="o">=</span> <span class="n">fr</span><span class="p">.</span><span class="n">read</span><span class="p">(</span><span class="mi">8</span><span class="p">)</span>  <span class="c1"># b'\x01\x00\x00\x00\x00\x00\x00\x02'
</span> 
<span class="n">struct</span><span class="p">.</span><span class="n">unpack</span><span class="p">(</span><span class="s">"&lt;I"</span><span class="p">,</span> <span class="n">x</span><span class="p">[:</span><span class="mi">4</span><span class="p">])</span>  <span class="c1"># (1,)
</span><span class="n">struct</span><span class="p">.</span><span class="n">unpack</span><span class="p">(</span><span class="s">"&gt;I"</span><span class="p">,</span> <span class="n">x</span><span class="p">[</span><span class="mi">4</span><span class="p">:])</span>  <span class="c1"># (2,)
</span></code></pre></div></div>

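<p>In the example above, the writer and the reader have in effect agreed on the following protocol: the file contains two 32-bit unsigned integers, the first stored as Little Endian and the second as Big Endian.</p>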
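
<p>A minimal sketch of what happens when the reader does not follow the writer's convention: the same four bytes decode to different integers depending on the byte order the reader assumes. The variable names are illustrative only.</p>

```python
import struct

# The writer stores the value 1 as a little-endian unsigned 32-bit integer.
raw = struct.pack("<I", 1)

# A reader that follows the same convention recovers the value.
as_little = struct.unpack("<I", raw)[0]

# A reader that assumes big-endian sees the bytes 01 00 00 00 as 0x01000000.
as_big = struct.unpack(">I", raw)[0]

print(raw, as_little, as_big)  # b'\x01\x00\x00\x00' 1 16777216
```

<p>Nothing in the bytes themselves says which reading is right, which is why the byte order has to be part of the file format.</p>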

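<p><code class="language-plaintext highlighter-rouge">numpy.ndarray.newbyteorder</code> reinterprets the same underlying bytes under a different byte order. NumPy 2.0 removed this method; there the equivalent is <code class="language-plaintext highlighter-rouge">arr.view(arr.dtype.newbyteorder(order))</code>.</p>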

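<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">import</span> <span class="nn">numpy</span> <span class="k">as</span> <span class="n">np</span>

<span class="c1"># 'S' swaps to the opposite byte order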
</span><span class="n">np</span><span class="p">.</span><span class="n">array</span><span class="p">([</span><span class="mi">1</span><span class="p">,</span> <span class="mi">2</span><span class="p">],</span> <span class="n">dtype</span><span class="o">=</span><span class="n">np</span><span class="p">.</span><span class="n">int16</span><span class="p">).</span><span class="n">newbyteorder</span><span class="p">(</span><span class="s">'S'</span><span class="p">)</span>  <span class="c1"># array([256, 512], dtype=int16)
# '=' (native) and 'I' (ignore) keep the byte order unchanged
</span><span class="n">np</span><span class="p">.</span><span class="n">array</span><span class="p">([</span><span class="mi">1</span><span class="p">,</span> <span class="mi">2</span><span class="p">],</span> <span class="n">dtype</span><span class="o">=</span><span class="n">np</span><span class="p">.</span><span class="n">int16</span><span class="p">).</span><span class="n">newbyteorder</span><span class="p">(</span><span class="s">'='</span><span class="p">)</span>  <span class="c1"># array([1, 2], dtype=int16)
</span><span class="n">np</span><span class="p">.</span><span class="n">array</span><span class="p">([</span><span class="mi">1</span><span class="p">,</span> <span class="mi">2</span><span class="p">],</span> <span class="n">dtype</span><span class="o">=</span><span class="n">np</span><span class="p">.</span><span class="n">int16</span><span class="p">).</span><span class="n">newbyteorder</span><span class="p">(</span><span class="s">'I'</span><span class="p">)</span>  <span class="c1"># array([1, 2], dtype=int16)
</span>
<span class="c1"># The next two results depend on the system; a Little Endian host is assumed below
</span><span class="kn">import</span> <span class="nn">sys</span>
<span class="n">sys</span><span class="p">.</span><span class="n">byteorder</span>  <span class="c1"># 'little'
</span><span class="n">np</span><span class="p">.</span><span class="n">array</span><span class="p">([</span><span class="mi">1</span><span class="p">,</span> <span class="mi">2</span><span class="p">],</span> <span class="n">dtype</span><span class="o">=</span><span class="n">np</span><span class="p">.</span><span class="n">int16</span><span class="p">).</span><span class="n">newbyteorder</span><span class="p">(</span><span class="s">'&lt;'</span><span class="p">)</span>  <span class="c1"># array([1, 2], dtype=int16)
</span><span class="n">np</span><span class="p">.</span><span class="n">array</span><span class="p">([</span><span class="mi">1</span><span class="p">,</span> <span class="mi">2</span><span class="p">],</span> <span class="n">dtype</span><span class="o">=</span><span class="n">np</span><span class="p">.</span><span class="n">int16</span><span class="p">).</span><span class="n">newbyteorder</span><span class="p">(</span><span class="s">'&gt;'</span><span class="p">)</span>  <span class="c1"># array([256, 512], dtype=int16)
</span></code></pre></div></div>
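
<p>The numbers 256 and 512 can be checked without NumPy. The sketch below uses only the standard library: it reads the raw bytes of two 16-bit integers under the opposite byte order, and separately swaps the bytes in place with <code class="language-plaintext highlighter-rouge">array.byteswap</code>. This is a stand-in for illustration, not what NumPy does internally.</p>

```python
import sys
from array import array

a = array("h", [1, 2])  # 'h' is a signed short, 2 bytes on mainstream platforms
assert a.itemsize == 2

# Reinterpretation: keep the bytes, read them with the opposite byte order.
raw = a.tobytes()
opposite = "big" if sys.byteorder == "little" else "little"
reinterpreted = [
    int.from_bytes(raw[i:i + 2], opposite, signed=True)
    for i in range(0, len(raw), 2)
]

# Byte swap: change the bytes, keep reading them in native order.
swapped = array("h", a)  # copy, so `a` itself stays untouched
swapped.byteswap()

print(reinterpreted, list(swapped))  # [256, 512] [256, 512]
```

<p>Both routes give the same numbers because 1 is <code class="language-plaintext highlighter-rouge">0x0001</code> and 256 is <code class="language-plaintext highlighter-rouge">0x0100</code>: exchanging the two bytes turns one into the other.</p>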

  </div>

  <script>
    tocbot.init({
      tocSelector: '#toc',
      contentSelector: '.post-content.e-content',
      headingSelector: 'h1, h2, h3, h4, h5',
      hasInnerContainers: true,
      collapseDepth: 3
    });
  </script><a class="u-url" href="/2024/09/21/misc.html" hidden></a>
</article>

  </body>
</html>

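Appendix: setting up an HTTP server

The simplest setup is described below, tested in a virtual machine (Ubuntu 18.04).

Preparation:

  • Make sure the host and the virtual machine can reach each other over the network: see the CSDN blog post

  • The commands for finding the IP addresses of the host and the virtual machine are:

    # Host: Windows 10
    ipconfig  # assume 192.168.1.105
    # Virtual machine: Ubuntu 18.04
    hostname -I  # assume 192.168.1.102
    

The procedure follows the CSDN blog post. In short, run the following in the virtual machine: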

sudo apt update
sudo apt install apache2

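Then use sudo to copy the contents of _site into the /var/www/html directory. After that, the pages can be opened from a browser on the host through the virtual machine's IP address, for example: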

http://192.168.1.102
http://192.168.1.102/a
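
For a quick check without installing apache2, a static directory can also be served with Python's standard library. The sketch below is only an illustration under assumptions: the helper names make_static_server and fetch_index are hypothetical, a temporary directory stands in for _site, and the server binds to 127.0.0.1 on a free port rather than to the virtual machine's address.

```python
import functools
import http.server
import tempfile
import threading
import urllib.request
from pathlib import Path


def make_static_server(directory, host="127.0.0.1", port=0):
    """Build an HTTP server that serves files from `directory` (port 0 = any free port)."""
    handler = functools.partial(
        http.server.SimpleHTTPRequestHandler, directory=str(directory)
    )
    return http.server.ThreadingHTTPServer((host, port), handler)


def fetch_index(directory):
    """Start the server in a background thread, fetch /index.html once, then stop it."""
    server = make_static_server(directory)
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        port = server.server_address[1]
        # An empty ProxyHandler keeps the request local even if a proxy is configured.
        opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
        with opener.open(f"http://127.0.0.1:{port}/index.html") as resp:
            return resp.read().decode("utf-8")
    finally:
        server.shutdown()
        server.server_close()


# A temporary directory with one page stands in for the generated _site folder.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "index.html").write_text("<h1>hello</h1>", encoding="utf-8")
    page = fetch_index(tmp)

print(page)
```

To reach such a server from the host, it would have to listen on an address the host can reach (for example 0.0.0.0) and on a fixed port, instead of the loopback address used here.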