Shell Scripting in Python

All you can do with a shell script is make it worse. But since this is Python, you can make it better.

Neopythonic: Overheard
— Guido van Rossum

#!/usr/bin/env python 
import subprocess, sys

cmd = 'ls -l'
retcode = subprocess.call(cmd, shell=True) 
if retcode != 0: sys.exit(retcode)

	Python Shebang Line。
	`subprocess.call()` 的回傳值，直接就是 cmd 執行過後的 exit status。

執行外部程式，或用 Python 來寫 Shell Scripts

Python 2.4 新增了 subprocess 這個 module，可以用來執行外部程式（也就是 spawn 一個 subprocess 或 child process），它的出現是為了取代舊有的 module 或 function，包括 os.system()、os.spawn*()、os.popen*()、popen2.* 與 commands.*。

最常用的就是 subprocess.call()，它可以帶參數執行外部程式，等外部程式結束之後再回傳 return code，可以用來判斷執行結果是成功還是失敗：

subprocess.call(args, executable=None, shell=False)

其中 args 可以傳入 sequence of strings 或單一個 string。注意 subprocess 的某些行為在 Windows 與 Unix 下會有差異：

Unix

>>> subprocess.call('echo Hello World')      
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python2.6/subprocess.py", line 470, in call
    return Popen(*popenargs, **kwargs).wait()
  File "/usr/lib/python2.6/subprocess.py", line 623, in __init__
    errread, errwrite)
  File "/usr/lib/python2.6/subprocess.py", line 1141, in _execute_child
    raise child_exception
OSError: [Errno 2] No such file or directory 
>>> subprocess.call(['echo', 'Hello World']) 
Hello World                                  
0                                            
>>>
>>> subprocess.call('echo Hello World', shell=True)      
Hello World
0
>>> subprocess.call(['echo', 'Hello World'], shell=True) 

0
>>> subprocess.call(['echo Hello World', 'shell arguments'], shell=True)
Hello World
0

	`shell=False` 時，單一個 string 的 args 會被解讀成 executable（就這個例子而言，`echo Hello World` 這個執行檔並不存在）；這個用法只適用於沒有 argument 的情況。
	要執行的程式不存在時，會丟出 `OSError`。
	`shell=False` 時，把 executable 跟 argument 拆開，就可以避開上述的問題。
	外部程式的輸出會直接寫到 standard output。
	這是外部程式的 return code，可用來判斷執行結果是成功還是失敗。
	`shell=True` 時，executable 預設是 `/bin/sh`（因此 args 本身或第一個 item 就不會被視為 executable），單一個 string 的 args 會被視為 command string 整個丟給 shell 執行。
	`shell=True` 時，只有第一個 item 會被視為 command string 丟給 shell 執行，但剩下的 item 則當做 shell 的 argument。就這個例子而言，`echo` 沒有接任何 argument（`'Hello World'` 是給 shell 用的），因此 standard output 沒有輸出任何東西。

什麼時候要用 shell=True？又使用 shell=True 有什麼要注意的地方？

從下面的執行結果來看，使用 shell=False（預設）或 shell=True 並沒有什麼差別：

>>> subprocess.call(['echo', 'Hello World']) # shell=False
Hello World
0
>>>
>>> subprocess.call('echo Hello World', shell=True)
Hello World
0

但像 fg 這類 built-in command，就一定得透過 shell 來執行不可：

>>> subprocess.call('fg')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python2.6/subprocess.py", line 470, in call
    return Popen(*popenargs, **kwargs).wait()
  File "/usr/lib/python2.6/subprocess.py", line 623, in __init__
    errread, errwrite)
  File "/usr/lib/python2.6/subprocess.py", line 1141, in _execute_child
    raise child_exception
OSError: [Errno 2] No such file or directory
>>> subprocess.call('fg', shell=True)
fg: 1: No current job
2

上面提到 shell=True 時，args 本身或第一個 item 會被視為 command string 整個丟給 shell 執行。也就是說 args 的寫法要跟平常在 shell prompt 裡輸入指令時一樣，同樣要考慮 quoting 或 backslash escaping 的問題，command substitution、shell parameter expansion 等也都會作用… 例如：

>>> subprocess.call(['echo', '`whoami`'])
`whoami` 
0
>>> subprocess.call('echo `whoami`', shell=True)
jeremy
0
>>> subprocess.call('echo $LOGNAME', shell=True)
jeremy
0

顯然 shell=False 時，command substitution 並不會作用。

因此透過 shell=True 可以很快地將現有的 shell script 轉成 Python 的版本。不過如果不需要 shell 提供 substitution 或 expansion 的支援的話，採用 shell=False 反而可以省去 quoting 或 backslash escaping 的麻煩。例如：

>>> subprocess.call("echo I'm file", shell=True)
/bin/sh: Syntax error: Unterminated quoted string
2
>>> subprocess.call("echo \"I'm file\"", shell=True)
I'm file
0
>>> subprocess.call(['echo', "I'm file"])
I'm file
0

Windows

>>> subprocess.call('echo Hello World')                 
Hello World
0
>>> subprocess.call('echo Hello World', shell=True)     
Hello World
0
>>> subprocess.call(['echo', 'Hello World'])
Hello World
0
>>> subprocess.call(['echo', 'Hello World'], shell=True) 
"Hello World"                                            
0

>>> subprocess.call('echo %COMSPEC%')
%COMSPEC%
0
>>> subprocess.call('echo %COMSPEC%', shell=True)        
C:\Windows\system32\cmd.exe
0
>>> subprocess.call(['echo', '%COMSPEC%'], shell=True)   
C:\Windows\system32\cmd.exe
0

>>> subprocess.call(r'c:\Program Files\Java\jre7\bin\java.exe -version') 
java version "1.7.0"
Java(TM) SE Runtime Environment (build 1.7.0-b147)
Java HotSpot(TM) 64-Bit Server VM (build 21.0-b17, mixed mode)
0
>>> subprocess.call(r'c:\Program Files\Java\jre7\bin\java.exe -version', shell=True)
'c:\Program' is not recognized as an internal or external command, operable program or batch file.
1
>>> subprocess.call(r'"c:\Program Files\Java\jre7\bin\java.exe" -version', shell=True) 
java version "1.7.0"
Java(TM) SE Runtime Environment (build 1.7.0-b147)
Java HotSpot(TM) 64-Bit Server VM (build 21.0-b17, mixed mode)
0

>>> subprocess.call('del') 
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Python27\lib\subprocess.py", line 493, in call
    return Popen(*popenargs, **kwargs).wait()
  File "C:\Python27\lib\subprocess.py", line 679, in __init__
    errread, errwrite)
  File "C:\Python27\lib\subprocess.py", line 893, in _execute_child
    startupinfo)
WindowsError: [Error 2] The system cannot find the file specified
>>> subprocess.call('del', shell=True)
The syntax of the command is incorrect.
1

	`shell=False` 時，並不會像在 Unix 一樣，將單一個 string 的 args 解讀成 executable。
	`shell=True` 時，executable 預設是環境變數 `COMSPEC` 的內容，例如 `"C:\Windows\system32\cmd.exe"`。
	看起來 args 的型式（單一個 string 或 sequence of strings）或 `shell` 的值，都對執行結果沒有影響？
	為什麼 Hello World 兩側多了雙引號？
	`shell=True` 時，environment variable 才會被展開（不論 args 的型式為何）。
	簡單歸納一下，`subprocess.call()` 的行為並不會受 args 的型式影響，但 `shell=True` 與 `shell=False` 則有差異。
	讓人比較不解的是，為什麼這個例子可以執行，反倒是加了 `shell=True` 之後才出狀況？
	像平常在 DOS prompt 輸入的一樣，在含有空白字元的路徑兩側加上雙引號即可。
	跟在 Unix 一樣，像 `del`、`dir` 這類 built-in command，還是得透過 shell 來執行才行。

從上面的測試結果來看，subprocess.call() 的行為在 Unix 與 Windows 下存在著一些差異。不過下面兩種用法在不同 OS 下的行為是比較一致的：

subprocess.call(command_string, shell=True) – 其中 command_string 的寫法要跟平常在 shell prompt 裡輸入指令時一樣，同樣要考慮 quoting 或 backslash escaping 的問題。
subprocess.call(sequence_of_strings, shell=False) – 將個別的的 argument 拆開，不用考慮 quoting 或 backslash escaping 等問題。

除了 subprocess.call() 之外，另外還有 check_call() 跟 check_output()。

除了 check_output 會回傳 output 之外，用法相同跟 call() 一樣。只不過會自動檢查 return code，如果是非零值的話，就會直接丟出 CalledProcessError，這個時候 return code 還是可以透過 returncode 這個 attribute 來取回。例如：（但是要執行的程式不存在時，還是會丟出 OSError）

>>> subprocess.call('rm')
rm: missing operand
Try `rm --help' for more information.
1
>>> try:
...     subprocess.check_call('rm')
... except Exception as e:
...     print 'error code: %i' % e.returncode
...
rm: missing operand 
Try `rm --help' for more information.
error code: 1

>>> output = subprocess.check_output('uname -a', shell=True)
>>> type(output), output
(<type 'str'>, 'Linux jeremy-desktop 2.6.38-13-generic #52-Ubuntu SMP Tue Nov 8 16:48:07 UTC 2011 i686 i686 i386 GNU/Linux\n')

	發生錯誤時，外部程式的輸出還是會直接寫到 standard output。
	傳回的 byte string 包含外部程式的輸出。

用 Python 來寫 shell script 時，check_call() 跟 check_output() 是簡化 error handling 的好工具（類似 shell script 裡 set -o errexit 的效果），省去要一直檢查 return code 的麻煩。不過要注意 subprocess.check_output() 是 Python 2.7 才有的功能。

非同步（asynchronously）執行外部程式

上面的 subprocess.call() 雖然好用，不過控制權必須等外部程式結束之後才會回到 Python 程式。如果要非同步執行外部程式，或想在執行期與 child process 溝通，就必須直接使用 subprocess.call() 底層的 subprocess.Pope 才行。

事實上，subprocess.call() 的參數與 Popen 的 constructor 完全一樣，只是 subprocess.call() 內部在取得 Popen 的 instance 之後（可以將它視為與 child process 溝通的介面），會呼叫 Popen.wait() 等待外部程式執行結束，然後傳回 return code 而已。

process = subprocess.Popen(args, executable=None, shell=False)

在 Ubuntu 下，以 GNOME calculator 做簡單的測試：

>>> import subprocess
>>> process = subprocess.Popen('gcalctool') 
>>> process.wait() 
0
>>> process = subprocess.Popen('gcalctool')
>>> process.poll() 
>>> process.poll() 
0

	叫出計算機。
	程式會卡在這裡，直到手動將計算機關掉為止，才會有 return code 傳回來。
	重新開啟計算機，呼叫 `process.poll()` 傳回 `None`，表示還在執行。
	將計算機關掉後，再呼叫 `process.poll()` 一次，傳回 return code。

整理一下 Popen 幾個跟 child process 互動的方法：

Popen.poll() – 檢查 child process 是否還在執行。注意這個方法並不會回傳 True/False，而是用 None 來表示程式還在執行，如果程式已經結束則會傳回 return code。
Popen.wait() – 等 child process 結束之後，才傳回 return code，並將控制權交回 Python。
Popen.communicate(input=None) – 主要用來取回 stdout 跟 stderr 的輸出。

>>> import subprocess
>>> from subprocess import PIPE

>>> process = subprocess.Popen('uname -a', shell=True)
>>> Linux jeremy-laptop 2.6.35-30-generic #61-Ubuntu SMP Tue Oct 11 17:52:57 UTC 2011 x86_64 GNU/Linux 
>>> process.communicate()
(None, None)

>>> process = subprocess.Popen('uname -a', shell=True, stdout=PIPE, stderr=PIPE)
>>> process.communicate()
('Linux jeremy-laptop 2.6.35-30-generic #61-Ubuntu SMP Tue Oct 11 17:52:57 UTC 2011 x86_64 GNU/Linux\n', '')
>>> process.returncode
0

必須搭配 stdout=PIPE 跟 stderr=PIPE 使用（轉向到 Python 內部），否則會直接輸出到 stdout 或 stderr。

參考資料

常見問題

`.msi` 不是可執行檔？

.msi 不能直接用 subprocess 執行：

>>> import subprocess
>>> subprocess.Popen(r'C:\tmp\installer.msi')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Python27\lib\subprocess.py", line 679, in __init__
    errread, errwrite)
  File "C:\Python27\lib\subprocess.py", line 893, in _execute_child
    startupinfo)
WindowsError: [Error 193] %1 is not a valid Win32 application

問題就出在 .msi 本身不是一個可執行檔，必須要透過 msiexec 執行：

msiexec /i installer.msi

>>> import subprocess
>>> subprocess.Popen(['msiexec', '/i', r'C:\tmp\installer.msi'])

檔案持續被 child process 鎖住

>>> import subprocess, sys
>>> f = open(r'c:\tmp\test.txt', 'w')
>>> subprocess.Popen('notepad.exe')
<subprocess.Popen object at 0x00AAC790>
>>> sys.exit(0)

Python 程式結束時，c:\tmp\test.txt 繼續被 notepad.exe 鎖住，等 notepad.exe 關掉之後，這個 lock 才會解除。

Shell_Scripting_in_Python.files/subprocess_lock.png

要避免這個問題，可以搭配 close_fds=True 來使用。

subprocess.Popen('notepad.exe', close_fds=True)

關於 close_fds，官方文件是這麼說的：

on Windows, if close_fds is true then no handles will be inherited by the child process. Note that on Windows, you cannot set close_fds to true and also redirect the standard handles by setting stdin, stdout or stderr.

雖然看不太懂它在寫什麼，但大概是單純執行外部程式，而不把它當做是一個 child process 吧？

一	二	三	四	五	六	日
1	2	3	4	5	6	7
8	9	10	11	12	13	14
15	16	17	18	19	20	21
22	23	24	25	26	27	28
29	30

一	二	三	四	五	六	日
1	2	3	4	5	6	7
8	9	10	11	12	13	14
15	16	17	18	19	20	21
22	23	24	25	26	27	28
29	30

在電梯裡遇見雙胞胎

腦袋不是用來裝東西，而是用來思考問題的；所以我把懶得記的、記不住的，通通寫在這裡…

Shell Scripting in Python

執行外部程式，或用 Python 來寫 Shell Scripts

非同步（asynchronously）執行外部程式

常見問題

`.msi` 不是可執行檔？

檔案持續被 child process 鎖住

參考資料

發表留言取消回覆

執行外部程式，或用 Python 來寫 Shell Scripts

非同步（asynchronously）執行外部程式

常見問題

.msi 不是可執行檔？

檔案持續被 child process 鎖住

參考資料

分享此文：

發表留言 取消回覆

`.msi` 不是可執行檔？

發表留言取消回覆