Tuesday, October 7, 2008

Linux Command - vi: using vi to delete ^M characters

My experience

On Unix/BSD, to type a control symbol such as ^M,
hold Ctrl and press V, then M (that is, Ctrl-V followed by Ctrl-M). This produces the " ^M " symbol.

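A practical example:
When a file is transferred from Windows to Unix/BSD,
or between Unix machines in binary mode instead of ascii mode,

a ^M appears at the end of every line of the text file. It looks ugly, so we want to delete it.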
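Before cleaning a file, it can help to check whether it contains any carriage returns at all. The following is a minimal sketch, assuming a POSIX shell with grep available; count_cr_lines is a hypothetical helper name, not a standard command.

```shell
# count_cr_lines FILE: print how many lines of FILE contain a CR byte.
# (Hypothetical helper for illustration only.)
count_cr_lines() {
    cr=$(printf '\r')      # a literal CR byte, the character vi shows as ^M
    grep -c "$cr" "$1"     # -c prints the number of matching lines
}
```

A result of 0 suggests the file already has Unix line endings and needs no conversion.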

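Use vi's substitute command

to replace ^M with nothing (an empty string). The two forms below are equivalent, since % means lines 1 through $:
:%s/(Ctrl-V Ctrl-M)//g
:1,$s/(Ctrl-V Ctrl-M)//g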

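However, when I first received these scripts (by e-mail), I found that viewing them with vi on the DOM showed a ^M at the end of every line, while viewing them in the Linux source did not. Every time I wanted to run one of the scripts, I first had to open it in vi and delete the symbols by hand. Because the data is stored on a ramdisk, it reverted to its original state after every reboot. This problem bothered me for quite a while.
From the start I suspected that the extra symbols came from text-file conversion between UNIX and DOS, but I was trying to solve it from inside vi, either with search and replace or with some special vi feature, and nothing I tried worked.
Then I found the tool "dos2unix". After a trial run I was surprised to see that the ^M symbols were gone. The method is listed below.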
// Command format: dos2unix -n infile outfile
// If the destination file is a newly created file, add the "-n" parameter
# dos2unix -n wanfo.html wanfo

// If the destination file already exists (the file is converted in place), add the "-o" parameter or omit the parameter entirely
# dos2unix -o wanfo.html
# dos2unix wanfo.html

wanfo.html originally contained ^M symbols. After running the commands above, the resulting wanfo has no ^M symbols.
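On a system where dos2unix is not installed, roughly the same effect can be had with tr. This is only a sketch under that assumption: strip_cr_to and strip_cr_inplace are hypothetical helper names, and unlike dos2unix, tr -d removes every CR byte in the file, not just the ones at line ends.

```shell
# strip_cr_to INFILE OUTFILE: write a CR-free copy of INFILE to OUTFILE,
# similar in spirit to "dos2unix -n infile outfile". (Hypothetical helper.)
strip_cr_to() {
    tr -d '\r' < "$1" > "$2"
}

# strip_cr_inplace FILE: rewrite FILE itself without CR bytes,
# similar in spirit to "dos2unix -o file". (Hypothetical helper.)
strip_cr_inplace() {
    tmp=$(mktemp) || return 1
    # Convert into a temporary file first, then copy the result back,
    # so the original is only overwritten if tr succeeded.
    tr -d '\r' < "$1" > "$tmp" && cat "$tmp" > "$1"
    status=$?
    rm -f "$tmp"
    return "$status"
}
```

For the example above, this would be used as strip_cr_to wanfo.html wanfo, or strip_cr_inplace wanfo.html.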


Notes from another source
http://newbiedoc.sourceforge.net/text_editing/vi.html#SEARCHING

Searching and Replacing

Searching text is done with the command /xxx for a forward search or ?yyy for a backward search. n will skip to the next occurrence. When / is given without an argument, vi defaults to the argument of the last search.

Global Search and Replace:

The magic command is

:line1,line2s/old_string/new_string/g

The /g is optional; it means 'replace every occurrence on the line'. If not specified, vi will replace only the first occurrence in each line.

Special ^XX characters

To search for a ^XX character, you must use Ctrl-v (^v) in order to disable interpretation of the Ctrl commands.

A useful example:

Windows (MS-DOS) text files use RETURN/LINEFEED to end every line; Mac (classic Mac OS) uses only RETURN; and Unix/Linux uses only NEWLINE (which is the same as the linefeed in DOS). In Linux programming notation:

\r\n = chr(13)chr(10) = MS-DOS

\r = chr(13) = Mac

\n = chr(10) = Linux/Unix

So when displaying an msdos ascii file with vi (or with any other text editor), you will find each and every line ended by a ^M (it's character 13, aka \r, aka ENTER). When displaying a mac ascii file, you will have a single line with a ^M at what should be each end of line.
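The three conventions can be told apart by counting CR and LF bytes. The sketch below is a rough heuristic, assuming a POSIX shell with tr and wc; eol_style is a hypothetical helper name, and a file with no line endings at all is reported as "unix".

```shell
# eol_style FILE: print "dos", "mac", "unix", or "mixed".
# (Hypothetical helper; a byte-count heuristic, not a full parser.)
eol_style() {
    # $(( ... )) normalizes the whitespace some wc implementations emit.
    crs=$(( $(tr -cd '\r' < "$1" | wc -c) ))   # number of CR bytes
    lfs=$(( $(tr -cd '\n' < "$1" | wc -c) ))   # number of LF bytes
    if [ "$crs" -eq 0 ]; then
        echo unix      # no CR at all
    elif [ "$lfs" -eq 0 ]; then
        echo mac       # CR only
    elif [ "$crs" -eq "$lfs" ]; then
        echo dos       # as many CRs as LFs, presumably CRLF pairs
    else
        echo mixed
    fi
}
```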

Our MSDOS text file should look like this:

Friday the 13th^M
^M
^M
^M
Dear Sir,^M
^M
....

And our mac text file should look like this:

"Friday the 13th^M^M^M^MDear Sir,^M^M...."

[The Macintosh->Unix conversion isn't easy to do with vi macros, so we'll concentrate on msdos/windows->Unix]

MS-DOS/Windows -> UNIX conversion:

In order to remove these ugly ^M, you search for them and replace them by....nothing!

So first, let's search for those weird ^M ... but, how can you search for character 'ENTER'?

By preceding it with ^V (Control-V). Any keystroke after ^V is accepted literally -- that is, it won't have its usual command function, if it's something like ESC, ENTER, ^Z, etc...

What the following command tells VI to do is to replace the first (since the /g option isn't set, but anyway, we only expect one) ^M on every line, with nothing (there is nothing between the last two slashes: //):

This is what you type

                     :1,$s/^V^M//   

(where ^V is Control-V, and ^M is ENTER or Control-M)

Note that VI doesn't display the ^V, so this is what you actually see:

                   :1,$s/^M//

And it should work...
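The same substitution can be run outside vi with sed. This is a sketch, assuming a POSIX shell and sed; strip_trailing_cr is a hypothetical helper name. It removes a CR only when it is the last character on the line.

```shell
# strip_trailing_cr: filter stdin to stdout, removing one CR from the end
# of each line. (Hypothetical helper for illustration only.)
strip_trailing_cr() {
    cr=$(printf '\r')     # literal CR byte, written ^M in vi
    sed "s/$cr\$//"       # the sed script is s/<CR>$//
}
```

It reads standard input, so a typical call is strip_trailing_cr < infile > outfile.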

[FYI the "text-edition task force" is working on an elegant way to convert mac ascii files to unix, but the first research campaign hasn't brought us much]
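Outside vi, the Mac case may be simpler than it looks: if the file really uses a bare CR as its only line ending, tr can translate each CR into an LF. This is a sketch under that assumption; mac_to_unix is a hypothetical helper name. It should not be used on an MS-DOS file, where it would turn every CRLF into two newlines.

```shell
# mac_to_unix: filter stdin to stdout, turning every CR into an LF.
# Only appropriate for files whose sole line ending is a bare CR.
# (Hypothetical helper for illustration only.)
mac_to_unix() {
    tr '\r' '\n'
}
```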
