From owner-ntemacs-users@june Tue Aug 27 17:24:38 1996 X-VM-v5-Data: ([nil nil nil nil nil nil nil nil nil] [nil "Tue" "27" "August" "1996" "16:45:00" "PDT" "George V. Reilly" "georger@microcrafts.com" nil "27" "RE: More ctrl-M stuff" "^From:" nil nil "8" nil nil nil nil] nil) Received: from joker.cs.washington.edu (joker.cs.washington.edu [128.95.1.42]) by june.cs.washington.edu (8.7.5/7.2ju) with SMTP id RAA29214 for ; Tue, 27 Aug 1996 17:24:38 -0700 Received: from june.cs.washington.edu (june.cs.washington.edu [128.95.1.4]) by joker.cs.washington.edu (8.6.12/7.2ws+) with ESMTP id RAA30852 for ; Tue, 27 Aug 1996 17:24:37 -0700 Received: from halcyon.com (smtp2.halcyon.com [198.137.231.18]) by june.cs.washington.edu (8.7.5/7.2ju) with SMTP id QAA25586 for ; Tue, 27 Aug 1996 16:46:01 -0700 Received: from ms-smtp.wa.com by halcyon.com with SMTP id AA11191 (5.65c/IDA-1.4.4 for ); Tue, 27 Aug 1996 16:46:00 -0700 Received: by ms-smtp.wa.com with Microsoft Mail id <32238978@ms-smtp.wa.com>; Tue, 27 Aug 96 16:49:12 PDT Message-Id: <32238978@ms-smtp.wa.com> Encoding: 27 TEXT X-Mailer: Microsoft Mail V3.0 From: "George V. Reilly" To: ntemacs-users Subject: RE: More ctrl-M stuff Date: Tue, 27 Aug 96 16:45:00 PDT The solution that Vim uses, which works well in practice, is to have two variables, textauto (global) and textmode (buffer-local). If textmode is set, a file is written with DOS-style (CR-LF line separators); if it's off, the file is written with Unix-style (LF line separators). By default, textmode is set on all new buffers for DOS-like systems (DOS, OS/2, Win32) and cleared on all other systems. If textauto is set, then textmode is set for a buffer when a file is read in which has every line separated by CR-LFs and cleared otherwise. In either case, the file looks fine on screen. If you edit and write a file, the line separator settings will remain the same unless you explicitly override them. 
This is something I find very annoying with NT Emacs---especially when diffing a modified file against an original file which came from Unix and having diff report the whole file has changed. If the file has non-standard separator settings for the OS (e.g., LFs on NT), you'll see a note about it in the message line. -- /George V. Reilly MicroCrafts, Inc., 17371 NE 67th Ct #205, Redmond, WA 98052, USA. Tel: +1 206/250-0014 Fax: 206/250-0100 Web: www.microcrafts.com Vim 4 (vi clone) for NT & Windows 95: http://www.halcyon.com/gvr/ pgp fingerprint: e2 b4 83 64 11 52 21 ea bf d8 51 c2 11 00 78 fc From owner-ntemacs-users@june Fri Nov 1 08:18:26 1996 X-VM-v5-Data: ([nil nil nil nil t nil nil nil nil] [nil "Fri" " 1" "November" "1996" "16:34:38" "+0100" "Frederic Corne" "frederic.corne@erli.fr" "<9611011534.AA07747@orme.sunserv>" "28" "Pb of crlf with Samba and untranslate" "^From:" nil nil "11" nil nil nil nil] nil) Received: from joker.cs.washington.edu (joker.cs.washington.edu [128.95.1.42]) by june.cs.washington.edu (8.7.6/7.2ju) with SMTP id IAA10826 for ; Fri, 1 Nov 1996 08:18:26 -0800 Received: from june.cs.washington.edu (june.cs.washington.edu [128.95.1.4]) by joker.cs.washington.edu (8.6.12/7.2ws+) with ESMTP id IAA25748 for ; Fri, 1 Nov 1996 08:18:24 -0800 Received: from polaris.gsi.fr (polaris.gsi.fr [150.175.128.2]) by june.cs.washington.edu (8.7.6/7.2ju) with ESMTP id HAA08122 for ; Fri, 1 Nov 1996 07:34:55 -0800 Received: from erli.fr ([150.175.65.76]) by polaris.gsi.fr (8.7.3/8.6.12) with SMTP id QAA04907 for ; Fri, 1 Nov 1996 16:35:53 +0100 (MET) Received: from orme.sunserv by erli.fr (4.1/SMI-4.1) id AA19201; Fri, 1 Nov 96 16:34:40 +0100 Received: by orme.sunserv (5.x/SMI-SVR4) id AA07747; Fri, 1 Nov 1996 16:34:38 +0100 Message-Id: <9611011534.AA07747@orme.sunserv> Reply-To: frederic.corne@erli.fr From: Frederic Corne To: ntemacs-users@cs.washington.edu Subject: Pb of crlf with Samba and untranslate Date: Fri, 1 Nov 1996 16:34:38 +0100 NOTE : This is a 
repost. It seems my previous message was lost. I have installed Samba 1.9.16p7 on my unix box and I use untranslate.el with emacs19.31.1 on my NT machine. (load "untranslate") (add-untranslated-filesystem "E:") at the top of my .emacs file When I read and write a simple file ( for ex a README file) all are OK. No crlf before and after. But when the file is of a particular mode (c, c++, text, ...) the read is correct ( no crlf) but when I save the file after modification, crlf is added. Any idea ? FC -- **** Frederic CORNE GSI-ERLI frederic.corne@erli.fr **** From da@dcs.ed.ac.uk Wed Jan 22 04:21:09 1997 X-VM-v5-Data: ([nil nil nil nil t nil nil nil nil] [nil "Wed" "22" "January" "1997" "12:20:18" "+0000" "David Aspinall" "da@dcs.ed.ac.uk" "<199701221221.EAA09932@june.cs.washington.edu>" "35" "Re: DOS (text) mode" "^From:" nil nil "1" nil nil nil nil] nil) Received: from rainich.dcs.ed.ac.uk (rainich.dcs.ed.ac.uk [129.215.160.105]) by june.cs.washington.edu (8.8.3+CSE/7.2ju) with ESMTP id EAA09932 for ; Wed, 22 Jan 1997 04:21:04 -0800 Message-Id: <199701221221.EAA09932@june.cs.washington.edu> Received: from INVOKE.demon.co.uk (actually host modem3.dcs.ed.ac.uk) by rainich.dcs.ed.ac.uk with SMTP (PP); Wed, 22 Jan 1997 12:19:57 +0000 X-Mailer: emacs 19.34.1 (via feedmail 3 Q) In-Reply-To: <199701220756.XAA25816@joker.cs.washington.edu> References: <199701151430.GAA18597@june.cs.washington.edu> <199701220756.XAA25816@joker.cs.washington.edu> From: David Aspinall To: voelker@cs.washington.edu (Geoff Voelker) Cc: da@dcs.ed.ac.uk Subject: Re: DOS (text) mode Date: Wed, 22 Jan 1997 12:20:18 +0000 > I'm unfamiliar with format-alist; what support is missing? format-alist: "List of information about understood file formats." I think it was added to deal with enriched mode where text properties are saved to the file. I don't know much about it --- I just read the doc string. 
From that it seems as if it might cope nicely with DOS text files, if a regular expression could be used to match the start of a file. (If not, perhaps format-alist could be extended to use a regexp or a function argument). Then it will automatically call hooks to encode and decode the buffer. I don't think this would add anything new to existing mechanisms (whether the built-in handling of binary files, or the "DOS" minor mode), but since Emacs now provides a hook for decoding different file formats it might seem wise to integrate with it? After discussions on the list about various DOS translation ideas I thought I should mention this variable. Personally I dislike the current mechanism: I would rather that files were handled in "binary" mode by default, and only in DOS-text mode if they can be deduced to be in DOS-text mode when visited. (Perhaps some file extensions should trigger DOS-text mode, but I am not convinced). There should be an easy way to switch to DOS-text mode, just as with enriched mode. I think this would be a nice behaviour for those of us that use mixed text-formats; for people who use only DOS-text, perhaps there could be a variable to enable the current DOS-loving behaviour. - David. From waider@autodealing.com Wed Mar 5 04:25:54 1997 X-VM-v5-Data: ([nil nil nil nil nil nil nil nil nil] [nil "Wed" " 5" "March" "1997" "11:24" "GMT" "Ronan Waide" "waider@autodealing.com" nil "20" "bug in load from ange-ftp directory?" 
"^From:" nil nil "3" nil nil nil nil] nil) Received: from trout.cs.washington.edu (trout.cs.washington.edu [128.95.1.178]) by june.cs.washington.edu (8.8.5+CS/7.2ju) with ESMTP id EAA15893 for ; Wed, 5 Mar 1997 04:25:48 -0800 Received: from mail (gate.autodealing.com [194.125.131.131]) by trout.cs.washington.edu (8.8.5+CS/7.2ws+) with SMTP id DAA04609 for ; Wed, 5 Mar 1997 03:59:38 -0800 (PST) Received: from waider.cognotec.com by mail with smtp (Smail3.1.29.1 #3) id m0w2EoI-002mKGC; Wed, 5 Mar 97 11:24 GMT Message-Id: MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Organization: AutoDealing Software, Ltd. From: Ronan Waide To: Geoff Voelker , Andrew Innes Subject: bug in load from ange-ftp directory? Date: Wed, 5 Mar 97 11:24 GMT Hiho, I'm using the recent patched version of emacs 19.34 on win95 at the moment. In an attempt to consolidate disparate emacs src and lib directories, I've put a lot of stuff on a local ftp-able machine, and I load it from there. However, emacs seems to have some trouble loading .elc files via the ftp link; it successfully downloads them to the local drive, but then fails to load them, usually complaining of a missing bracket. Doing a find-file followed by eval-current-buffer works fine, however. I suspect it may be loading the downloaded file in text-mode, since ange-ftp creates the downloaded file as a temporary file with no extension. Could either of you confirm this suspicion? Regards, Waider. I'll try hacking ange-ftp-load (again!) in the meantime. 
-- waider@autodealing.com / AutoDealing Software Ltd / +353-1-6766455 Never attribute to malloc that which can be adequately explained by stupidity From owner-ntemacs-users@trout Tue Apr 8 06:12:18 1997 X-VM-v5-Data: ([nil nil nil nil nil nil nil nil nil] [nil "Tue" " 8" "April" "1997" "13:29:59" "+0100" "Andrew Innes" "andrewi@harlequin.co.uk" nil "67" "Re: Attachments via ange-ftp" "^From:" nil nil "4" nil nil nil nil] nil) Received: from joker.cs.washington.edu (joker.cs.washington.edu [128.95.1.42]) by june.cs.washington.edu (8.8.5+CS/7.2ju) with SMTP id GAA04047 for ; Tue, 8 Apr 1997 06:12:17 -0700 Received: from trout.cs.washington.edu (trout.cs.washington.edu [128.95.1.178]) by joker.cs.washington.edu (8.6.12/7.2ws+) with ESMTP id GAA30228 for ; Tue, 8 Apr 1997 06:12:17 -0700 Received: from june.cs.washington.edu (june.cs.washington.edu [128.95.1.4]) by trout.cs.washington.edu (8.8.5+CS/7.2ws+) with ESMTP id FAA27051 for ; Tue, 8 Apr 1997 05:31:51 -0700 (PDT) Received: from holly.cam.harlequin.co.uk (holly.cam.harlequin.co.uk [193.128.4.58]) by june.cs.washington.edu (8.8.5+CS/7.2ju) with ESMTP id FAA03036 for ; Tue, 8 Apr 1997 05:31:48 -0700 Received: from propos.long.harlequin.co.uk (propos.long.harlequin.co.uk [193.128.93.50]) by holly.cam.harlequin.co.uk (8.8.4/8.7.3) with ESMTP id NAA01533; Tue, 8 Apr 1997 13:30:46 +0100 (BST) Received: from elan.long.harlequin.co.uk (elan.long.harlequin.co.uk [193.128.93.78]) by propos.long.harlequin.co.uk (8.8.4/8.6.12) with SMTP id NAA29309; Tue, 8 Apr 1997 13:29:59 +0100 (BST) Message-Id: <199704081229.NAA29309@propos.long.harlequin.co.uk> In-reply-to: (message from Kyle Jones on Tue, 1 Apr 1997 21:34:19 -0500 (EST)) From: Andrew Innes To: kyle_jones@wonderworks.com CC: gray@austin.apc.slb.com, info-vm@uunet.uu.net, ntemacs-users@cs.washington.edu Subject: Re: Attachments via ange-ftp Date: Tue, 8 Apr 1997 13:29:59 +0100 (BST) On Tue, 1 Apr 1997 13:42:36 -0600, gray@austin.apc.slb.com (Douglas Gray Stephens) said: 
>I suspect that my problem is PC related, but I'm not sure if it >can/should be fixed in VM, or nt-emacs, hence I'm cross posting this >to ntemacs-users@cs.washington.edu to see if the nt-emacs side have >any suggestions. Yes, this problem is PC specific (for the most part). On Tue, 1 Apr 1997 21:34:19 -0500 (EST), Kyle Jones said: >Douglas Gray Stephens writes: >>[...] >>This ^M will be causing vm to encode the message in base64. >> >>I am not sure why you've used >>insert-file-contents-literally >>instead of >>insert-file-contents > >To avoid problems with file handlers uncompressing or otherwise >fiddling with the input. Maybe this is the wrong thing to do. >I'm willing to switch to insert-file-contents and see if that >works better. Given that we are talking about including files as MIME attachments, I think using insert-file-contents-literally is, in principle, the right thing to do; the "problem" in this context is that it disables the (imperfect) file type detection code used on Windows as well as inhibiting the various handlers and hook functions. Strictly speaking, if the original file uses DOS line endings, then that is what should be transmitted (in base64 encoding if required). However, if it is simply a plain text file, it would generally be more helpful to treat it as such, and convert it to whatever line ending convention is most suitable - in this case, convert to Unix line endings so that the contents are transmitted in the clear. So, although insert-file-contents-literally is strictly correct, in this instance it would be more helpful to use a modified version which only inhibits the handlers and hook functions, but leaves the file type code in place. Such a change should be safe to make, since it will only affect Windows where it will generally do the right thing. Aside: The whole issue of how text files are handled, by the DOS and Windows ports of Emacs at least, is really overdue for a major rethink. 
The current method for determining whether a file is text (implicitly meaning DOS text) or binary is based on regular expression matching against the file name. This leads to all sorts of hassles, most of which could be easily avoided by using a simple content scanning heuristic to identify whether a file is text or binary, and the line ending convention (DOS, Mac, Unix) if text. Personally, I would like to see this heuristic incorporated into Emacs (on all platforms, not just DOS and Windows) - it would make editing and manipulating text files from different sources mostly transparent. I don't know how likely it is this will happen though, since the Mule capabilities currently being added to Emacs (which must deal with the more general language/charset encoding properties of files and other data streams) will probably subsume this issue, and may do so in a completely different and more general way. Still, I expect that the line ending convention is usually orthogonal to charset encoding, so maybe there is a chance to do this anyway. AndrewI From owner-ntemacs-users@trout Tue Apr 8 09:48:12 1997 X-VM-v5-Data: ([nil nil nil nil nil nil nil nil nil] [nil "Tue" " 8" "April" "1997" "12:06:41" "-0400" "John R. 
Dennis" "jdennis@ultranet.com" nil "80" "Re: Attachments via ange-ftp" "^From:" nil nil "4" nil nil nil nil] nil) Received: from joker.cs.washington.edu (joker.cs.washington.edu [128.95.1.42]) by june.cs.washington.edu (8.8.5+CS/7.2ju) with SMTP id JAA18123 for ; Tue, 8 Apr 1997 09:48:11 -0700 Received: from trout.cs.washington.edu (trout.cs.washington.edu [128.95.1.178]) by joker.cs.washington.edu (8.6.12/7.2ws+) with ESMTP id JAA30317 for ; Tue, 8 Apr 1997 09:48:10 -0700 Received: from june.cs.washington.edu (june.cs.washington.edu [128.95.1.4]) by trout.cs.washington.edu (8.8.5+CS/7.2ws+) with ESMTP id JAA02649 for ; Tue, 8 Apr 1997 09:06:54 -0700 (PDT) Received: from cinna.ultra.net (cinna.ultra.net [199.232.56.8]) by june.cs.washington.edu (8.8.5+CS/7.2ju) with ESMTP id JAA14789 for ; Tue, 8 Apr 1997 09:06:52 -0700 Received: from DAKOTA (d9.dial-3.wor.ma.ultra.net [146.115.69.73]) by cinna.ultra.net (8.8.5/ult1.04) with SMTP id MAA04163; Tue, 8 Apr 1997 12:06:41 -0400 (EDT) Message-Id: <199704081606.MAA04163@cinna.ultra.net> In-reply-to: <199704081229.NAA29309@propos.long.harlequin.co.uk> (message from Andrew Innes on Tue, 8 Apr 1997 13:29:59 +0100 (BST)) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII From: "John R. Dennis" To: andrewi@harlequin.co.uk, John Dennis CC: kyle_jones@wonderworks.com, gray@austin.apc.slb.com, ntemacs-users@cs.washington.edu Subject: Re: Attachments via ange-ftp Date: Tue, 8 Apr 1997 12:06:41 -0400 (EDT) >>>>> "Andrew" == Andrew Innes writes: Andrew> Given that we are talking about including files as MIME Andrew> attachments, I think using insert-file-contents-literally Andrew> is, in principle, the right thing to do; the "problem" in Andrew> this context is that it disables the (imperfect) file type Andrew> detection code used on Windows as well as inhibiting the Andrew> various handlers and hook functions. 
Andrew> The whole issue of how text files are handled, by the DOS
Andrew> and Windows ports of Emacs at least, is really overdue for
Andrew> a major rethink.

I cannot believe how topical this issue is. I just spent all Friday morning debugging a similar problem in mime.el. Even though I had set all the variables I knew of that caused CRLF translation when inserting into a buffer...

(let ((start (point))
      (emx-binary-mode t)         ;Stop LF to CRLF conversion in OS/2
      (buffer-file-type t)        ;Stop LF to CRLF conversion in DOS/NT
      (binary-process-input t))   ;Stop LF to CRLF conversion in DOS/NT

the conversion was still happening because in fileio.c the implementation of insert-file-contents overwrites the user-supplied value of buffer-file-type:

current_buffer->buffer_file_type = call1 (Qfind_buffer_file_type, filename);

The elisp code knew it wanted to insert the contents of the file as binary so it explicitly set buffer-file-type, but the implementation of insert-file-contents ignored that setting and tried to determine the translation mode by a regular expression match on the filename. I fixed the problem by calling insert-file-contents-literally, which undefines find-buffer-file-type so the call in insert-file-contents to find-buffer-file-type won't succeed. But I don't think the C code in insert-file-contents should ignore the documented variable (buffer-file-type) that is supposed to toggle the CRLF translation!

All of this is pretty ugly, prone to failure, and more to the point undocumented for the most part as far as I can tell. After spending the better part of a day digging through the binary vs. text issues I was left with the distinct impression that most of this code is a "hack" waiting to break. I absolutely agree with Andrew that this is in need of a major rethink. To begin the discussion I will make the following observations:

* Determining binary/text based on regular expression matching of filenames is fundamentally flawed. 
There is not enough naming discipline with filenames and extensions to make this work reliably. I have been burned by this more times than I care to remember.

* The only way to tell if a file is binary is to scan the file and look for non-ASCII bytes.

* The documentation on the text/binary issues is woefully inadequate and the implementation is inconsistent.

* The binary/text translation should be controlled by a user-settable variable that is ALWAYS respected. After all, the user is ultimately more knowledgeable about the contents of a file than the implementation.

* There should be a second user-settable variable that toggles whether the translation variable is automatically set based on the contents of the file. In this way you get automatic translation in the 99% of the cases you want it AND you can force the translation on/off when you have to.

* We'd all be happier without operating systems that make the artificial distinction between text and binary files and attempts to insert/delete/modify bytes that are not in the actual file to undo the damage introduced by this ill-conceived distinction in the first place (sorry, this last point was a completely personal soapbox comment :-)

John Dennis

From owner-ntemacs-users@trout Wed Mar 26 09:53:47 1997 X-VM-v5-Data: ([nil nil nil nil nil nil nil nil nil] [nil "Wed" "26" "March" "1997" "09:07:47" "-0800" "Don Erway" "derway@ndc.com" nil "27" "Re: > toggle binary/text mode of current buffer" "^From:" nil nil "3" nil nil nil nil] nil) Received: from joker.cs.washington.edu (joker.cs.washington.edu [128.95.1.42]) by june.cs.washington.edu (8.8.5+CS/7.2ju) with SMTP id JAA25357 for ; Wed, 26 Mar 1997 09:53:47 -0800 Received: from trout.cs.washington.edu (trout.cs.washington.edu [128.95.1.178]) by joker.cs.washington.edu (8.6.12/7.2ws+) with ESMTP id JAA23324 for ; Wed, 26 Mar 1997 09:53:46 -0800 Received: from june.cs.washington.edu (june.cs.washington.edu [128.95.1.4]) by trout.cs.washington.edu (8.8.5+CS/7.2ws+) with ESMTP 
id JAA12324 for ; Wed, 26 Mar 1997 09:07:50 -0800 (PST) Received: from maya.ndc.com (maya.ndc.com [192.101.92.41]) by june.cs.washington.edu (8.8.5+CS/7.2ju) with ESMTP id JAA22697 for ; Wed, 26 Mar 1997 09:07:49 -0800 Received: from heidi.ndc-new.com (heidi [192.101.92.15]) by maya.ndc.com (8.7.5/8.7.3) with SMTP id JAA12674 for ; Wed, 26 Mar 1997 09:06:16 -0800 (PST) Received: from HAL.ndc.com by heidi.ndc-new.com (SMI-8.6/SMI-SVR4) id JAA13517; Wed, 26 Mar 1997 09:07:47 -0800 Message-Id: <199703261707.JAA13517@heidi.ndc-new.com> In-reply-to: <199703261320.AA11627@lambda.unx.sas.com> (message from David Biesack on Wed, 26 Mar 1997 08:20:33 -0500) Mime-Version: 1.0 (generated by tm-edit 7.92) Content-Type: text/plain; charset=US-ASCII From: Don Erway To: ntemacs-users@cs.washington.edu Subject: Re: > toggle binary/text mode of current buffer Date: Wed, 26 Mar 1997 09:07:47 -0800

>>>>> "db" == David Biesack writes:

db> suggested:

db> (defvar binary-mode-distance 500
db>   "Number of characters to search for CR/LF when looking for a binary file.")

db> (defun check-buffer-file-type (filename)
db>   (if (and (looking-at ".*\r\n")   ;; It has CR-LF sequence
db>            ;; and has no LF w/o CR within sight
db>            (not (re-search-forward "[^\r]\n" binary-mode-distance t)))
db>       nil   ;; so use text mode
db>     t))     ;; else use binary mode

This works fine. However, auto detection still does not work under Unix. I am running 19.32 on NT, and 19.33 on Solaris. In the 19.33 Solaris version, there is no file-name-buffer-file-type-alist defined. So without this alist, and some code to process it, it is no surprise that it doesn't work. Is the idea to use winnt.el even when running on Unix? If I load winnt.el into the Unix version, it complains that the set-message-beep function doesn't exist. But I can always work around that if this is even the right approach. 
Don From owner-ntemacs-users@trout Wed Mar 26 06:03:34 1997 X-VM-v5-Data: ([nil nil nil nil nil nil nil nil nil] [nil "Wed" "26" "March" "1997" "08:20:33" "-0500" "David Biesack" "sasdjb@unx.sas.com" nil "41" "> toggle binary/text mode of current buffer" "^From:" nil nil "3" nil nil nil nil] nil) Received: from joker.cs.washington.edu (joker.cs.washington.edu [128.95.1.42]) by june.cs.washington.edu (8.8.5+CS/7.2ju) with SMTP id GAA14317 for ; Wed, 26 Mar 1997 06:03:33 -0800 Received: from trout.cs.washington.edu (trout.cs.washington.edu [128.95.1.178]) by joker.cs.washington.edu (8.6.12/7.2ws+) with ESMTP id GAA17020 for ; Wed, 26 Mar 1997 06:03:32 -0800 Received: from june.cs.washington.edu (june.cs.washington.edu [128.95.1.4]) by trout.cs.washington.edu (8.8.5+CS/7.2ws+) with ESMTP id FAA07763 for ; Wed, 26 Mar 1997 05:20:43 -0800 (PST) Received: from lamb.sas.com (lamb.sas.com [192.35.83.8]) by june.cs.washington.edu (8.8.5+CS/7.2ju) with SMTP id FAA13537 for ; Wed, 26 Mar 1997 05:20:41 -0800 Received: from mozart by lamb.sas.com (5.65c/SAS/Gateway/01-23-95) id AA11423; Wed, 26 Mar 1997 08:20:39 -0500 Received: from lambda.unx.sas.com by mozart (5.65c/SAS/Domains/5-6-90) id AA21315; Wed, 26 Mar 1997 08:20:33 -0500 Received: by lambda.unx.sas.com (5.65c/SAS/Generic 9.01/3-26-93) id AA11627; Wed, 26 Mar 1997 08:20:33 -0500 Message-Id: <199703261320.AA11627@lambda.unx.sas.com> In-Reply-To: <199703252229.OAA06160@sampras.isi.com> (message from Kin Cho on Tue, 25 Mar 1997 14:29:49 -0800) From: David Biesack To: ntemacs-users@cs.washington.edu Subject: > toggle binary/text mode of current buffer Date: Wed, 26 Mar 1997 08:20:33 -0500 > ;;; This examines the actual contents of the loaded file to see if > ;;; it should use text mode or binary: > (defun check-buffer-file-type (filename) > (if (and (looking-at ".*\r\n") ;; It has CR-LF sequence > (not (search-forward "[^\r]\n]" nil t))) ;; and has no LF w/o CR > nil ;; so use text mode > t)) ;; else use binary mode 
Someone else pointed out that the search-forward should be a re-search-forward. However, also note that passing nil as the search bound causes the entire buffer to be inspected, which can be slow for large files. It might be better to make this a variable, as is done in dos-mode.el

;;; LCD Archive Entry:
;;; dos-mode|Andy Norman|ange@hplb.hpl.hp.com
;;; |MSDOS minor mode for GNU Emacs
;;; |$Date: 2001/02/13 00:53:57 $|$Revision: 1.1 $|

which passes (min (point-max) dos-mode-distance) to re-search-forward, where

(defvar dos-mode-distance 200
  "Number of characters to search for RETURN when looking for a DOS file.")

to determine if a file is in DOS CR/LF mode. You can change dos-mode-distance to 1000 or some other reasonable value in your .emacs

suggested:

(defvar binary-mode-distance 500
  "Number of characters to search for CR/LF when looking for a binary file.")

(defun check-buffer-file-type (filename)
  (if (and (looking-at ".*\r\n")   ;; It has CR-LF sequence
           ;; and has no LF w/o CR within sight
           (not (re-search-forward "[^\r]\n" binary-mode-distance t)))
      nil   ;; so use text mode
    t))     ;; else use binary mode

From owner-ntemacs-users@trout Tue Mar 25 18:23:53 1997 X-VM-v5-Data: ([nil nil nil nil nil nil nil nil nil] [nil "Tue" "25" "March" "1997" "20:20:21" "-0500" "Geoff Odhner" "odhner@recom.com" nil "69" "Re: toggle binary/text mode of current buffer" "^From:" nil nil "3" nil nil nil nil] nil) Received: from joker.cs.washington.edu (joker.cs.washington.edu [128.95.1.42]) by june.cs.washington.edu (8.8.5+CS/7.2ju) with SMTP id SAA20274 for ; Tue, 25 Mar 1997 18:23:53 -0800 Received: from trout.cs.washington.edu (trout.cs.washington.edu [128.95.1.178]) by joker.cs.washington.edu (8.6.12/7.2ws+) with ESMTP id SAA17699 for ; Tue, 25 Mar 1997 18:23:51 -0800 Received: from june.cs.washington.edu (june.cs.washington.edu [128.95.1.4]) by trout.cs.washington.edu (8.8.5+CS/7.2ws+) with ESMTP id RAA27386 for ; Tue, 25 Mar 1997 17:19:46 -0800 (PST) Received: from recom.recom.com 
(freeholders.co.camden.nj.us [204.213.88.1]) by june.cs.washington.edu (8.8.5+CS/7.2ju) with SMTP id RAA15955 for ; Tue, 25 Mar 1997 17:19:46 -0800 Received: from odhner (dial31.mt-holly.emanon.net [204.213.88.131]) by recom.recom.com (8.6.12/8.6.9) with SMTP id UAA02882; Tue, 25 Mar 1997 20:25:16 -0500 Message-ID: <333879D5.2FEC@recom.com> X-Mailer: Mozilla 2.01Gold (Win95; I) MIME-Version: 1.0 References: <199703241934.LAA10013@heidi.ndc-new.com> Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit From: Geoff Odhner To: Don Erway CC: kin@isi.com, ntemacs-users@cs.washington.edu Subject: Re: toggle binary/text mode of current buffer Date: Tue, 25 Mar 1997 20:20:21 -0500

Don Erway wrote:
> > The one funny is that files to which I have only read-only access come up as
> > writeable.
>
> I spoke too soon. It appears that visiting read-only files works on NT,
> but under unix, a read-only file is not translated correctly.
> check-buffer-file-type works, but translation does not occur.
>
> On NT, the files do get translated, and do not come up as writeable.

Try my latest version. It should address this problem. It works on Win95, but I haven't yet tested it on Unix, though I'm expecting no problem.

BTW, one caveat about using this on Unix: though this code works to toggle the buffer type, the mode line indicator doesn't work on Unix, at least not on SunOS. If you add the mode line %t indicator, it always indicates T on the mode line. I expect that requires a fix to the C code and a recompile. I guess they figured no one would ever use it on Unix. :-)

Happy editing...

-Geoff

And here's the new version, as promised:

;;; If you have loaded a file as binary that actually has the ^M's in it,
;;; then switching to text mode will remove them in the buffer. Of course
;;; now that it's in text mode, it will save with the ^M's inserted.
;;; Switching to binary mode does NOT have a reverse effect. 
If you want ;;; to disable that change on entering text mode, then use a negative ;;; prefix argument, as described below. ;;; A prefix argument will force the mode change in a particular ;;; direction. A positive prefix argument forces it to binary. A zero ;;; prefix argument forces text mode allowing the removal of ^M's (only ;;; preceding ^J's). A negative prefix argument forces text mode ;;; disallowing the removal of ^M's. ;;; When the mode is changed the state of modification of the buffer is ;;; preserved, even if the ^M's are removed. (defun toggle-buffer-file-type (arg) "Alternate value of buffer-file-type" (interactive "P") (let ((old buffer-file-type) (mod (buffer-modified-p)) (buffer-read-only nil)) (setq buffer-file-type (if arg (>= arg 1) (not buffer-file-type))) (if (and old (not buffer-file-type) (or (not arg) (> arg -2))) (save-excursion (beginning-of-buffer) (while (search-forward "\r\n" nil t) (replace-match "\n" nil t)) (set-buffer-modified-p mod)))) (force-mode-line-update)) ;; And my preferred key binding: (global-set-key [?\A-t] 'toggle-buffer-file-type) From kin@isi.com Tue Mar 25 14:28:47 1997 X-VM-v5-Data: ([nil nil nil nil nil nil nil nil nil] [nil "Tue" "25" "March" "1997" "14:29:49" "-0800" "Kin Cho" "kin@isi.com" nil "30" "Re: toggle binary/text mode of current buffer" "^From:" nil nil "3" nil nil nil nil] nil) Received: from sampras.isi.com (sampras.isi.com [192.103.53.29]) by june.cs.washington.edu (8.8.5+CS/7.2ju) with SMTP id OAA03847; Tue, 25 Mar 1997 14:28:47 -0800 Received: (from kin@localhost) by sampras.isi.com (8.6.10/8.6.10) id OAA06160; Tue, 25 Mar 1997 14:29:49 -0800 Message-Id: <199703252229.OAA06160@sampras.isi.com> In-reply-to: <33355BAF.2353@recom.com> (message from Geoff Odhner on Sun, 23 Mar 1997 11: 34:55 -0500) From: Kin Cho To: odhner@recom.com, voelker@cs.washington.edu CC: derway@ndc.com, ntemacs-users@cs.washington.edu Subject: Re: toggle binary/text mode of current buffer Date: Tue, 25 Mar 1997 14:29:49 -0800 
Thanks, this is good! A real solution as compared to the workarounds that came before. If only this worked on Unix as well! Please put it in the FAQ, or even better, integrate it with the mainline code.

-kin

p.s., this is my mod:

              (list (cons "" 'check-buffer-file-type))))

;;; Associate the universal match regexp "" with the
;;; function check-buffer-file-type, so any file will be
;;; examined to automatically select the appropriate mode.
;;; Add this check only after known filename patterns are
;;; treated the way they should be. (That's why we append
;;; to the list, instead of replacing it). You might want
;;; to use more restrictive pattern(s) for doing this
;;; check.
(setq file-name-buffer-file-type-alist
      (append file-name-buffer-file-type-alist
              (list (cons "" 'check-buffer-file-type))))

;;; This examines the actual contents of the loaded file to see if
;;; it should use text mode or binary:
(defun check-buffer-file-type (filename)
  (if (and (looking-at ".*\r\n")                    ;; It has CR-LF sequence
           (not (search-forward "[^\r]\n]" nil t))) ;; and has no LF w/o CR
      nil   ;; so use text mode
    t))     ;; else use binary mode

From owner-ntemacs-users@trout Sun Mar 23 09:13:15 1997 X-VM-v5-Data: ([nil nil nil nil nil nil nil nil nil] [nil "Sun" "23" "March" "1997" "11:34:55" "-0500" "Geoff Odhner" "odhner@recom.com" nil "33" "Re: toggle binary/text mode of current buffer" "^From:" nil nil "3" nil nil nil nil] nil) Received: from joker.cs.washington.edu (joker.cs.washington.edu [128.95.1.42]) by june.cs.washington.edu (8.8.5+CS/7.2ju) with SMTP id JAA20826 for ; Sun, 23 Mar 1997 09:13:15 -0800 Received: from trout.cs.washington.edu (trout.cs.washington.edu [128.95.1.178]) by joker.cs.washington.edu (8.6.12/7.2ws+) with ESMTP id JAA26846 for ; Sun, 23 Mar 1997 09:13:13 -0800 Received: from june.cs.washington.edu (june.cs.washington.edu [128.95.1.4]) by trout.cs.washington.edu (8.8.5+CS/7.2ws+) with ESMTP id IAA08257 for ; Sun, 23 Mar 1997 08:34:38 -0800 (PST) Received: from 
recom.recom.com (freeholders.co.camden.nj.us [204.213.88.1]) by june.cs.washington.edu (8.8.5+CS/7.2ju) with SMTP id IAA19986 for ; Sun, 23 Mar 1997 08:34:37 -0800 Received: from odhner (dial15.mt-holly.emanon.net [204.213.88.115]) by recom.recom.com (8.6.12/8.6.9) with SMTP id LAA11941; Sun, 23 Mar 1997 11:39:44 -0500 Message-ID: <33355BAF.2353@recom.com> X-Mailer: Mozilla 2.01Gold (Win95; I) MIME-Version: 1.0 References: <199703222055.MAA14453@heidi.ndc-new.com> Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit From: Geoff Odhner To: Don Erway CC: kin@isi.com, ntemacs-users@cs.washington.edu Subject: Re: toggle binary/text mode of current buffer Date: Sun, 23 Mar 1997 11:34:55 -0500

Don Erway wrote:
> Finally, it needs an auto option, to make it possible to
> automatically go into text mode if the content is strictly
> text and crlfs are already present in a file. This should
> not be based on file name extensions or file systems, but
> only on file content.

I have written a few more bits of code that help automate binary/text mode selection. This approach is possible thanks to Geoff Voelker's foresight in designing the infrastructure to be configurable in this way. Thanks, Geoff.

;;; Associate the universal match regexp "" with the
;;; function check-buffer-file-type, so any file will be
;;; examined to automatically select the appropriate mode.
;;; Add this check only after known filename patterns are
;;; treated the way they should be. (That's why we append
;;; to the list, instead of replacing it). You might want
;;; to use more restrictive pattern(s) for doing this
;;; check. 
(setq file-name-buffer-file-type-alist (append file-name-buffer-file-type-alist (cons "" 'check-buffer-file-type))) ;;; This examines the actual contents of the loaded file to see if ;;; it should use text mode or binary: (defun check-buffer-file-type (filename) (if (and (looking-at ".*\r\n") ;; It has CR-LF sequence (not (search-forward "[^\r]\n]" nil t))) ;; and has no LF w/o CR nil ;; so use text mode t)) ;; else use binary mode From owner-ntemacs-users@trout Sat Mar 22 13:28:17 1997 X-VM-v5-Data: ([nil nil nil nil nil nil nil nil nil] [nil "Sat" "22" "March" "1997" "12:55:31" "-0800" "Don Erway" "derway@ndc.com" nil "23" "Re: toggle binary/text mode of current buffer" "^From:" nil nil "3" nil nil nil nil] nil) Received: from joker.cs.washington.edu (joker.cs.washington.edu [128.95.1.42]) by june.cs.washington.edu (8.8.5+CS/7.2ju) with SMTP id NAA14305 for ; Sat, 22 Mar 1997 13:28:17 -0800 Received: from trout.cs.washington.edu (trout.cs.washington.edu [128.95.1.178]) by joker.cs.washington.edu (8.6.12/7.2ws+) with ESMTP id NAA23216 for ; Sat, 22 Mar 1997 13:28:15 -0800 Received: from june.cs.washington.edu (june.cs.washington.edu [128.95.1.4]) by trout.cs.washington.edu (8.8.5+CS/7.2ws+) with ESMTP id MAA24518 for ; Sat, 22 Mar 1997 12:56:39 -0800 (PST) Received: from maya.ndc.com (maya.ndc.com [192.101.92.41]) by june.cs.washington.edu (8.8.5+CS/7.2ju) with ESMTP id MAA13538 for ; Sat, 22 Mar 1997 12:56:38 -0800 Received: from heidi.ndc-new.com (heidi [192.101.92.15]) by maya.ndc.com (8.7.5/8.7.3) with SMTP id MAA05675; Sat, 22 Mar 1997 12:54:00 -0800 (PST) Received: from HAL.ndc.com by heidi.ndc-new.com (SMI-8.6/SMI-SVR4) id MAA14453; Sat, 22 Mar 1997 12:55:31 -0800 Message-Id: <199703222055.MAA14453@heidi.ndc-new.com> In-reply-to: <33343F98.2A7@recom.com> (message from Geoff Odhner on Sat, 22 Mar 1997 15:22:48 -0500) Mime-Version: 1.0 (generated by tm-edit 7.92) Content-Type: text/plain; charset=US-ASCII From: Don Erway To: odhner@recom.com CC: kin@isi.com, 
ntemacs-users@cs.washington.edu Subject: Re: toggle binary/text mode of current buffer Date: Sat, 22 Mar 1997 12:55:31 -0800

This is good. I can now happily make everything binary by default, and use your toggle function for the few cases where it is really needed. This is better than using crypt's DOS mode, because it is faster.

Now, if only it would work in unix emacs, we could completely share files either way.

Finally, it needs an auto option, to make it possible to automatically go into text mode if the content is strictly text and crlfs are already present in a file. This should not be based on file name extensions or file systems, but only on file content.

Thanks for the useful hack.

Don

Don Erway                 derway@ndc.com
NDC Systems               818-939-3847
5314 N. Irwindale Ave     Fax:939-3870
Irwindale, CA, 91706

From owner-ntemacs-users@trout Sat Mar 22 13:01:33 1997 X-VM-v5-Data: ([nil nil nil nil nil nil nil nil nil] [nil "Sat" "22" "March" "1997" "15:22:48" "-0500" "Geoff Odhner" "odhner@recom.com" nil "52" "Re: toggle binary/text mode of current buffer" "^From:" nil nil "3" nil nil nil nil] nil) Received: from joker.cs.washington.edu (joker.cs.washington.edu [128.95.1.42]) by june.cs.washington.edu (8.8.5+CS/7.2ju) with SMTP id NAA13680 for ; Sat, 22 Mar 1997 13:01:33 -0800 Received: from trout.cs.washington.edu (trout.cs.washington.edu [128.95.1.178]) by joker.cs.washington.edu (8.6.12/7.2ws+) with ESMTP id NAA25772 for ; Sat, 22 Mar 1997 13:01:31 -0800 Received: from june.cs.washington.edu (june.cs.washington.edu [128.95.1.4]) by trout.cs.washington.edu (8.8.5+CS/7.2ws+) with ESMTP id MAA23716 for ; Sat, 22 Mar 1997 12:22:12 -0800 (PST) Received: from recom.recom.com (recom.recom.com [204.213.88.1]) by june.cs.washington.edu (8.8.5+CS/7.2ju) with SMTP id MAA12359 for ; Sat, 22 Mar 1997 12:22:12 -0800 Received: from odhner (dial5.mt-holly.emanon.net [204.213.88.105]) by recom.recom.com (8.6.12/8.6.9) with SMTP id PAA01331; Sat, 22 Mar 1997 15:27:36 -0500 Message-ID: 
<33343F98.2A7@recom.com> X-Mailer: Mozilla 2.01Gold (Win95; I) MIME-Version: 1.0 References: <199703212025.MAA03727@sampras.isi.com> Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit From: Geoff Odhner To: Kin Cho CC: ntemacs-users@cs.washington.edu Subject: Re: toggle binary/text mode of current buffer Date: Sat, 22 Mar 1997 15:22:48 -0500 Kin Cho wrote: > > Is that a function that does this? > Trying to work around yet another PC<->UNIX integration problem. > > Thanks. > > -kin I have yet another version of my toggle-buffer-file-type function. This one updates the status bar, which is necessary if you bind it to a key. -Geoff ;;; If you have loaded a file as binary that actually has the ^M's in it, ;;; then switching to text mode will remove them in the buffer. Of course ;;; now that it's in text mode, it will save with the ^M's inserted. ;;; Switching to binary mode does NOT have a reverse effect. If you want ;;; to disable that change on entering text mode, then use a negative ;;; prefix argument, as described below. ;;; A prefix argument will force the mode change in a particular ;;; direction. A positive prefix argument forces it to binary. A zero ;;; prefix argument forces text mode allowing the removal of ^M's (only ;;; preceding ^J's). A negative prefix argument forces text mode ;;; disallowing the removal of ^M's. ;;; When the mode is changed the state of modification of the buffer is ;;; preserved, even if the ^M's are removed. 
(defun toggle-buffer-file-type (arg)
  "Alternate value of buffer-file-type"
  (interactive "P")
  (let ((old buffer-file-type)
        (mod (buffer-modified-p))
        ;; "P" passes the raw prefix (nil, a number, a list or `-'),
        ;; so convert it to a number before comparing.
        (n (and arg (prefix-numeric-value arg))))
    (setq buffer-file-type (if n (>= n 1) (not buffer-file-type)))
    ;; Strip the ^M's only when going from binary to text, and not
    ;; when a negative prefix argument was given.
    (if (and old (not buffer-file-type)
             (or (not n) (>= n 0)))
        (save-excursion
          (goto-char (point-min))
          (while (search-forward "\r\n" nil t)
            (replace-match "\n" nil t))
          (set-buffer-modified-p mod))))
  (force-mode-line-update))

;;; Here's my personal selection for a key binding for this function:
(global-set-key [?\A-t] 'toggle-buffer-file-type)

From andrewi@harlequin.co.uk Tue Apr 15 06:15:20 1997 X-VM-v5-Data: ([nil nil nil nil nil nil nil nil nil] [nil "Tue" "15" "April" "1997" "14:14:36" "+0100" "Andrew Innes" "andrewi@harlequin.co.uk" nil "38" "Questions about MULE" "^From:" nil nil "4" nil nil nil nil] nil) Received: from holly.cam.harlequin.co.uk (holly.cam.harlequin.co.uk [193.128.4.58]) by june.cs.washington.edu (8.8.5+CS/7.2ju) with ESMTP id GAA22624 for ; Tue, 15 Apr 1997 06:15:19 -0700 Received: from propos.long.harlequin.co.uk (propos.long.harlequin.co.uk [193.128.93.50]) by holly.cam.harlequin.co.uk (8.8.4/8.7.3) with ESMTP id OAA28494; Tue, 15 Apr 1997 14:15:10 +0100 (BST) Received: from elan.long.harlequin.co.uk (elan.long.harlequin.co.uk [193.128.93.78]) by propos.long.harlequin.co.uk (8.8.4/8.6.12) with SMTP id OAA25514; Tue, 15 Apr 1997 14:14:36 +0100 (BST) Message-Id: <199704151314.OAA25514@propos.long.harlequin.co.uk> In-reply-to: <199704150439.AAA16856@psilocin.gnu.ai.mit.edu> (message from Richard Stallman on Tue, 15 Apr 1997 00:39:58 -0400) From: Andrew Innes To: rms@gnu.ai.mit.edu cc: voelker@cs.washington.edu Subject: Questions about MULE Date: Tue, 15 Apr 1997 14:14:36 +0100 (BST)

On Tue, 15 Apr 1997 00:39:58 -0400, Richard Stallman said:
>I see nothing problematical in these changes. 
>The ones that have to do with cr conversion will have to be >redone totally differently for the next Emacs release, though, >because MULE affects this very much. I am only dimly aware of what the MULE work for 19.35 entails, so if you have time I would like to ask a few questions about it. Since the issue of DOS vs Unix line ending conventions for text files is currently handled poorly in 19.34 (in the DOS and Windows ports), there has been a fair bit of discussion recently on the ntemacs-users mailing list about possible mechanisms for improving this in the future. This applies primarily in the context of working with a mixture of text files using both line ending conventions. The main difficulties at present are that Emacs doesn't, in general, correctly identify text vs. binary files, and for text files doesn't remember which line ending convention was used. The general thrust of suggestions for improvement is to implement some kind of mostly-automatic mechanism to detect which files are text, and remember the line ending convention in use (DOS, Unix or possibly Mac). Obvious heuristics based on scanning the first part of each file when loaded for "funny" characters could be used. More sophisticated extensions which detect mistakes in the assumed format follow on from that. I know this issue overlaps somewhat with the more general language and character encoding issues that are handled by MULE, but I'm not sure how exactly. Is there any documentation about MULE, as being implemented in 19.35, that I could read? It seems to me that the line ending convention employed by text files is often orthogonal to the character encoding convention (at least for single-and multi-byte encoding, and for Unicode as well after allowing for wider characters), and so a mechanism for automatically detecting and propagating the convention in use could still be of value. 
AndrewI From rms@gnu.ai.mit.edu Tue Jul 1 17:55:36 1997 X-VM-v5-Data: ([nil nil nil nil nil nil nil nil nil] [nil "Tue" " 1" "July" "1997" "20:55:47" "-0400" "Richard Stallman" "rms@gnu.ai.mit.edu" nil "20" "New way of handling CRLF" "^From:" nil nil "7" nil nil nil nil] nil) Received: from psilocin.gnu.ai.mit.edu (psilocin.gnu.ai.mit.edu [128.52.46.62]) by june.cs.washington.edu (8.8.5+CS/7.2ju) with ESMTP id RAA13267 for ; Tue, 1 Jul 1997 17:55:35 -0700 Received: by psilocin.gnu.ai.mit.edu (8.8.5/8.6.12GNU) id UAA26226; Tue, 1 Jul 1997 20:55:47 -0400 Message-Id: <199707020055.UAA26226@psilocin.gnu.ai.mit.edu> From: Richard Stallman To: eliz@is.elta.co.il, voelker@cs.washington.edu Subject: New way of handling CRLF Date: Tue, 1 Jul 1997 20:55:47 -0400 The MULE features include a new way of handling CRLF conversion. It detects the need to convert CRLF using the same mechanism that detects the need to convert international character sets. One consequence of this is that it ought to succeed in editing files that use LF or files that use CRLF. Regardless of what type of system you are on and what type of file system you are using, the file will appear in the normal Emacs way, with newlines between the lines. Does this mean that some of the features for text vs binary files and untranslated file systems are now unnecessary? Can I simplify the "Text Files and Binary Files" in the manual? Please answer me as soon as you can; I am trying to finish the manual. Note: currently there is a bug: when you visit a file on Unix which uses CRLF between lines, it recognizes that, but buffer-file-coding-system is set to nil, which is not right. I will forward you the fix for this as soon as I get it. 
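One way to see what the detection described above decided for a given buffer is to look at the end-of-line part of its coding system. The following is only a minimal sketch, assuming a current Emacs in which `coding-system-eol-type' is available; the helper name `my-buffer-eol-style' is hypothetical and is not part of Emacs.

```elisp
;; Hypothetical helper, not part of Emacs: report the line-ending
;; convention recorded in a buffer's `buffer-file-coding-system'.
(defun my-buffer-eol-style (&optional buffer)
  "Return `unix', `dos', `mac' or `undecided' for BUFFER.
BUFFER defaults to the current buffer."
  (with-current-buffer (or buffer (current-buffer))
    (let ((eol (and buffer-file-coding-system
                    (coding-system-eol-type buffer-file-coding-system))))
      ;; `coding-system-eol-type' returns 0, 1 or 2 for LF, CRLF and CR,
      ;; and a vector of candidate coding systems when the EOL type
      ;; has not been fixed.
      (cond ((eql eol 0) 'unix)
            ((eql eol 1) 'dos)
            ((eql eol 2) 'mac)
            (t 'undecided)))))
```

Evaluating (my-buffer-eol-style) in a buffer visiting a file shows which convention will be used when that buffer is saved.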
From eliz@is.elta.co.il Wed Jul 2 01:03:33 1997 X-VM-v5-Data: ([nil nil nil nil t nil nil nil nil] [nil "Wed" " 2" "July" "1997" "11:03:09" "+0300" "Eli Zaretskii" "eliz@is.elta.co.il" "" "34" "Re: New way of handling CRLF" "^From:" nil nil "7" nil nil nil nil] nil) Received: from is.elta.co.il (is.elta.co.il [199.203.121.2]) by june.cs.washington.edu (8.8.5+CS/7.2ju) with SMTP id BAA28595 for ; Wed, 2 Jul 1997 01:03:30 -0700 Received: by is.elta.co.il (SMI-8.6/SMI-SVR4) id LAA27499; Wed, 2 Jul 1997 11:03:10 +0300 X-Sender: eliz@is In-Reply-To: <199707020055.UAA26226@psilocin.gnu.ai.mit.edu> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII From: Eli Zaretskii To: Richard Stallman cc: voelker@cs.washington.edu Subject: Re: New way of handling CRLF Date: Wed, 2 Jul 1997 11:03:09 +0300 (IDT) On Tue, 1 Jul 1997, Richard Stallman wrote: The following is purely theoretical, based on what you told in your message. I didn't have time yet to download and install the pretest, neither do I know how does MULE detect and convert the file format. > Does this mean that some of the features for text vs binary files > and untranslated file systems are now unnecessary? Can I simplify > the "Text Files and Binary Files" in the manual? I would guess that the manual needs to be changed, but not necessarily simplified. The text vs binary thing has two aspects: reading them into Emacs and writing them back to the filesystem. No matter how smart the CRLF detection mechanism is, there will be cases when users will want a buffer to be written in specific format of their preference, which might be different from the format of the original file as read by Emacs. I'm also not sure that the CRLF detection can be made fully automatic. Imagine a binary file (like an executable program) that includes a CRLF pair somewhere; would Emacs 20 strip the CR from it when it reads that file and treat it as text? 
So I think Emacs 20 will need to keep the special varieties of `find-file' that specify text or binary explicitly (btw, it seems as if they aren't mentioned anywhere in the 19.34 manual). There should also be a way to tell Emacs to write a buffer (or a region) with LFs translated to CRLFs. In particular, the (un)?translated filesystem feature should be kept IMHO. If the above reasoning is true, there should be minor changes to the manual (to explain the automatic CRLF detection feature), but the bulk of the text should be kept. From eliz@is.elta.co.il Thu Jul 3 08:39:40 1997 X-VM-v5-Data: ([nil nil nil nil nil nil nil nil nil] [nil "Thu" " 3" "July" "1997" "18:36:11" "+0300" "Eli Zaretskii" "eliz@is.elta.co.il" nil "14" "Re: New way of handling CRLF" "^From:" nil nil "7" nil nil nil nil] nil) Received: from is.elta.co.il (is.elta.co.il [199.203.121.2]) by june.cs.washington.edu (8.8.5+CS/7.2ju) with SMTP id IAA18394 for ; Thu, 3 Jul 1997 08:39:39 -0700 Received: by is.elta.co.il (SMI-8.6/SMI-SVR4) id SAA01572; Thu, 3 Jul 1997 18:36:11 +0300 X-Sender: eliz@is In-Reply-To: <199707030040.UAA03584@psilocin.gnu.ai.mit.edu> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII From: Eli Zaretskii To: Richard Stallman cc: voelker@cs.washington.edu Subject: Re: New way of handling CRLF Date: Thu, 3 Jul 1997 18:36:11 +0300 (IDT) On Wed, 2 Jul 1997, Richard Stallman wrote: > We have two mechanisms for deciding whether a file should have LF, not > CRLF based on the file name. One looks for "binary files" and one > looking for untranslated file systems. > > Could these be unified, I wonder? On "translated" file systems, Emacs should decide whether the file is text (and then convert CRLF -> LF) or binary. On "untranslated" file systems, all files should be read and written verbatim. 
From rms@gnu.ai.mit.edu Wed Jul 2 17:39:07 1997 X-VM-v5-Data: ([nil nil nil nil nil nil nil nil nil] [nil "Wed" " 2" "July" "1997" "20:39:37" "-0400" "Richard Stallman" "rms@gnu.ai.mit.edu" nil "11" "Re: New way of handling CRLF" "^From:" nil nil "7" nil nil nil nil] nil) Received: from psilocin.gnu.ai.mit.edu (psilocin.gnu.ai.mit.edu [128.52.46.62]) by june.cs.washington.edu (8.8.5+CS/7.2ju) with ESMTP id RAA17875 for ; Wed, 2 Jul 1997 17:39:06 -0700 Received: by psilocin.gnu.ai.mit.edu (8.8.5/8.6.12GNU) id UAA03573; Wed, 2 Jul 1997 20:39:37 -0400 Message-Id: <199707030039.UAA03573@psilocin.gnu.ai.mit.edu> In-reply-to: <199707021953.MAA19844@joker.cs.washington.edu> (voelker@cs.washington.edu) References: <199707020055.UAA26226@psilocin.gnu.ai.mit.edu> <199707021953.MAA19844@joker.cs.washington.edu> From: Richard Stallman To: voelker@cs.washington.edu Subject: Re: New way of handling CRLF Date: Wed, 2 Jul 1997 20:39:37 -0400 I agree with Eli that users will still want a mechanism by which files are written in a format automatically determined by Emacs. I agree. Still, I would really really appreciate it if you would tell me how things DO work now! Does Emacs automatically figure out whether a file has CRLF or LF? (There is a bug in the pretest that fails to save a file with CRLF if it was recognized with CRLF. That has been fixed.) 
From rms@gnu.ai.mit.edu Wed Jul 2 17:40:15 1997 X-VM-v5-Data: ([nil nil nil nil nil nil nil nil nil] [nil "Wed" " 2" "July" "1997" "20:40:45" "-0400" "Richard Stallman" "rms@gnu.ai.mit.edu" nil "6" "Re: New way of handling CRLF" "^From:" nil nil "7" nil nil nil nil] nil) Received: from psilocin.gnu.ai.mit.edu (psilocin.gnu.ai.mit.edu [128.52.46.62]) by june.cs.washington.edu (8.8.5+CS/7.2ju) with ESMTP id RAA17925 for ; Wed, 2 Jul 1997 17:40:14 -0700 Received: by psilocin.gnu.ai.mit.edu (8.8.5/8.6.12GNU) id UAA03584; Wed, 2 Jul 1997 20:40:45 -0400 Message-Id: <199707030040.UAA03584@psilocin.gnu.ai.mit.edu> In-reply-to: <199707021953.MAA19844@joker.cs.washington.edu> (voelker@cs.washington.edu) References: <199707020055.UAA26226@psilocin.gnu.ai.mit.edu> <199707021953.MAA19844@joker.cs.washington.edu> From: Richard Stallman To: voelker@cs.washington.edu CC: eliz@is.elta.co.il Subject: Re: New way of handling CRLF Date: Wed, 2 Jul 1997 20:40:45 -0400 We have two mechanisms for deciding whether a file should have LF, not CRLF based on the file name. One looks for "binary files" and one looking for untranslated file systems. Could these be unified, I wonder? And could they both be done using file-coding-system-alist now? 
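As a rough illustration of the question above, both kinds of rule can be written as entries in file-coding-system-alist. This is only a sketch, assuming a current Emacs in which entries have the form (PATTERN . CODING-SYSTEM); the variable name `my-eol-coding-rules', the extension list and the directory /mnt/unix/ are hypothetical, and this is not a description of what the 1997 code did.

```elisp
;; Hypothetical rules, for illustration only.
(defvar my-eol-coding-rules
  '(;; "Binary files": no conversion of any kind.
    ("\\.\\(exe\\|dll\\|zip\\)\\'" . no-conversion)
    ;; "Untranslated file system": detect the character encoding,
    ;; but always read and write bare LF line endings.
    ("\\`/mnt/unix/" . undecided-unix))
  "Example (PATTERN . CODING-SYSTEM) rules, most specific first.")

;; `add-to-list' pushes onto the front, so add the rules in reverse
;; to keep the most specific rule first; the first match wins.
(dolist (rule (reverse my-eol-coding-rules))
  (add-to-list 'file-coding-system-alist rule))
```

With these entries one alist covers both jobs, and the order of the entries decides which rule applies to a binary file that lives on the untranslated file system.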
From rms@gnu.ai.mit.edu Thu Jul 3 12:18:41 1997 X-VM-v5-Data: ([nil nil nil nil t nil nil nil nil] [nil "Thu" " 3" "July" "1997" "15:19:06" "-0400" "Richard Stallman" "rms@gnu.ai.mit.edu" "<199707031919.PAA10787@psilocin.gnu.ai.mit.edu>" "12" "Re: New way of handling CRLF" "^From:" nil nil "7" nil nil nil nil] nil) Received: from psilocin.gnu.ai.mit.edu (psilocin.gnu.ai.mit.edu [128.52.46.62]) by june.cs.washington.edu (8.8.5+CS/7.2ju) with ESMTP id MAA07559 for ; Thu, 3 Jul 1997 12:18:39 -0700 Received: by psilocin.gnu.ai.mit.edu (8.8.5/8.6.12GNU) id PAA10787; Thu, 3 Jul 1997 15:19:06 -0400 Message-Id: <199707031919.PAA10787@psilocin.gnu.ai.mit.edu> In-reply-to: (message from Eli Zaretskii on Thu, 3 Jul 1997 18:36:11 +0300 (IDT)) References: From: Richard Stallman To: eliz@is.elta.co.il CC: voelker@cs.washington.edu Subject: Re: New way of handling CRLF Date: Thu, 3 Jul 1997 15:19:06 -0400 On "translated" file systems, Emacs should decide whether the file is text (and then convert CRLF -> LF) or binary. On "untranslated" file systems, all files should be read and written verbatim. That is what it does now--doesn't it? So what are you trying to say? Perhaps you misunderstood my question and answered a completely different one. Right now we have two separate mechanisms to do two similar jobs. Can we replace them with one mechanism that can do both jobs? 
From eliz@is.elta.co.il Sun Jul 6 07:22:25 1997 X-VM-v5-Data: ([nil nil nil nil nil nil nil nil nil] [nil "Sun" " 6" "July" "1997" "17:22:00" "+0300" "Eli Zaretskii" "eliz@is.elta.co.il" nil "18" "Re: New way of handling CRLF" "^From:" nil nil "7" nil nil nil nil] nil) Received: from is.elta.co.il (is.elta.co.il [199.203.121.2]) by june.cs.washington.edu (8.8.5+CS/7.2ju) with SMTP id HAA24478 for ; Sun, 6 Jul 1997 07:22:23 -0700 Received: by is.elta.co.il (SMI-8.6/SMI-SVR4) id RAA08656; Sun, 6 Jul 1997 17:22:01 +0300 X-Sender: eliz@is In-Reply-To: <199707031919.PAA10787@psilocin.gnu.ai.mit.edu> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII From: Eli Zaretskii To: Richard Stallman cc: voelker@cs.washington.edu Subject: Re: New way of handling CRLF Date: Sun, 6 Jul 1997 17:22:00 +0300 (IDT) On Thu, 3 Jul 1997, Richard Stallman wrote: > On "translated" file systems, Emacs should decide whether the file is > text (and then convert CRLF -> LF) or binary. > > On "untranslated" file systems, all files should be read and written > verbatim. > > That is what it does now--doesn't it? So what are you trying to > say? I was trying to say that the two should be combined. (un)?translated says whether the CRLF<->LF conversion is at all an issue, and the detection of the file type says whether this particular file needs the conversion, given that it belongs to a "translated" filesystem. If it already works this way, then my comments are redundant. 
From eliz@is.elta.co.il Sun Jul 6 07:23:08 1997 X-VM-v5-Data: ([nil nil nil nil nil nil nil nil nil] [nil "Sun" " 6" "July" "1997" "17:22:44" "+0300" "Eli Zaretskii" "eliz@is.elta.co.il" nil "35" "Re: New way of handling CRLF" "^From:" nil nil "7" nil nil nil nil] nil) Received: from is.elta.co.il (is.elta.co.il [199.203.121.2]) by june.cs.washington.edu (8.8.5+CS/7.2ju) with SMTP id HAA24484 for ; Sun, 6 Jul 1997 07:23:05 -0700 Received: by is.elta.co.il (SMI-8.6/SMI-SVR4) id RAA08662; Sun, 6 Jul 1997 17:22:44 +0300 X-Sender: eliz@is In-Reply-To: <199707042102.OAA34672@joker.cs.washington.edu> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII From: Eli Zaretskii To: Geoff Voelker cc: rms@gnu.ai.mit.edu, Andrew Innes Subject: Re: New way of handling CRLF Date: Sun, 6 Jul 1997 17:22:44 +0300 (IDT) On Fri, 4 Jul 1997, Geoff Voelker wrote: > correctly (e.g., on a text file with CRLF, both a find-file and a > find-file-binary create a buffer with the text file stripped of > CRLF, What about binary files with embedded CRLFs? How can Emacs tell which files are and which aren't ``text''? If it can't, then the above behavior is wrong: I might want to use `find-file-binary' to read a binary file (e.g., an executable program) that just happens to have embedded CRLF pairs, either as part of text messages or even just an opcode that happens to look like CRLF. Will I then be presented with the file with all CRs in CRLF pairs removed? > a text file without CRLF, Emacs reads it in correctly, but the > buffer-file-type is "text", and so the file gets written out with LF > converted to CRLF. This is not a bug in the coding-system code, but > rather due to the fact that, internally, Emacs under DOS_NT looks at > the buffer-file-type, sees "text", and opens the file in text mode, > and the operating system changes LF to CRLF. I'm not sure this is a bug, either. I can imagine cases where the user would like Unix-style text files be written as DOS-style text. 
I haven't decided yet what the default should be here, but at least a user-definable option should be available to get the non-default behavior. > Given the new coding-system framework, I think that all file I/O under > DOS_NT should now be done in binary mode since the data that Emacs > gives to the operating system does not need any conversion. If that is the case, how would a user tell Emacs that a file which originally had no CRs should have them added on output (assuming that you agree that such cases are possible)? From rms@gnu.ai.mit.edu Sun Jul 6 17:08:03 1997 X-VM-v5-Data: ([nil nil nil nil nil nil nil nil nil] [nil "Sun" " 6" "July" "1997" "20:08:26" "-0400" "Richard Stallman" "rms@gnu.ai.mit.edu" "<199707070008.UAA15380@psilocin.gnu.ai.mit.edu>" "15" "Re: New way of handling CRLF" "^From:" nil nil "7" nil nil nil nil] nil) Received: from psilocin.gnu.ai.mit.edu (psilocin.gnu.ai.mit.edu [128.52.46.62]) by june.cs.washington.edu (8.8.5+CS/7.2ju) with ESMTP id RAA07584 for ; Sun, 6 Jul 1997 17:08:02 -0700 Received: by psilocin.gnu.ai.mit.edu (8.8.5/8.6.12GNU) id UAA15380; Sun, 6 Jul 1997 20:08:26 -0400 Message-Id: <199707070008.UAA15380@psilocin.gnu.ai.mit.edu> In-reply-to: (message from Eli Zaretskii on Sun, 6 Jul 1997 17:22:44 +0300 (IDT)) References: From: Richard Stallman To: eliz@is.elta.co.il cc: voelker@cs.washington.edu, andrewi@harlequin.co.uk Subject: Re: New way of handling CRLF Date: Sun, 6 Jul 1997 20:08:26 -0400 What about binary files with embedded CRLFs? Specifying that a file is binary means specifying no conversion. Therefore, CRLF in these files will not be converted. > Given the new coding-system framework, I think that all file I/O under > DOS_NT should now be done in binary mode since the data that Emacs > gives to the operating system does not need any conversion. 
    If that is the case, how would a user tell Emacs that a file which originally had no CRs should have them added on output (assuming that you agree that such cases are possible)?

You can certainly do this by specifying a different coding system when you save the file.

From eliz@is.elta.co.il Sun Jul 13 10:59:56 1997 X-VM-v5-Data: ([nil nil nil nil nil nil nil nil nil] [nil "Sun" "13" "July" "1997" "20:59:41" "+0300" "Eli Zaretskii" "eliz@is.elta.co.il" nil "58" "New way to handle CRLF in Emacs 20.0" "^From:" nil nil "7" nil nil nil nil] nil) Received: from is.elta.co.il (is.elta.co.il [199.203.121.2]) by june.cs.washington.edu (8.8.5+CS/7.2ju) with SMTP id KAA13469 for ; Sun, 13 Jul 1997 10:59:55 -0700 Received: by is.elta.co.il (SMI-8.6/SMI-SVR4) id UAA28768; Sun, 13 Jul 1997 20:59:41 +0300 X-Sender: eliz@is Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII From: Eli Zaretskii To: Geoff Voelker cc: Richard Stallman , Andrew Innes Subject: New way to handle CRLF in Emacs 20.0 Date: Sun, 13 Jul 1997 20:59:41 +0300 (IDT)

> Actually, the new coding-system framework appears to obviate the need
> for buffer-file-type; file-coding-system-alist and
> buffer-file-coding-system appear to be flexible enough to supersede
> it. I will need to think more about this, though, since it is a
> rather drastic change under DOS_NT. (Eli and Andrew, if you get a
> chance to look at the coding system support, I would like to hear what
> you think about doing away with buffer-file-type, too.)

Here's what I think, after spending an evening reading the code and playing with Emacs.

I also think that the coding system can supersede buffer-file-type. We need to make a list of filename patterns that will automatically guess the coding system given a filename. If a given file is not in the list, Emacs should try to guess its EOL format, like it does now. 
Since this guess might be wrong (for example, Emacs decides that the file is CRLF-style when it sees the first CRLF pair, and might thus be fooled by a binary file), it would be nice to have options e.g. to ask the user whether the guess is correct, or require more than a single CRLF before a decision is made. (I didn't think about this too much, so I might be wrong.)

Emacs should only do the above for filesystems that aren't in the untranslated list (for which all file I/O should be unconverted).

I'd like to see user options (other than telling them to set the coding system) to have Emacs write files in a specific (CRLF or LF) format. The default behavior of preserving the original EOL encodings seems reasonable. The options would of course just set the coding system, but I'd rather people who need to do this don't have to know too much about coding systems.

I also think that the (un)?translated filesystem feature might be useful to Unix users as well. I can imagine NT or even DOS disks mounted via networks, or people who run Linux-based systems and access DOS partitions there (I actually see quite a few complaints from the latter on gnu.emacs.help). These might benefit by adding such disks to translated systems' list and having Emacs handle the conversion. So maybe it's a good idea to move this feature to lisp/files.el?

> Currently, the default for file-coding-system-alist is 'undecided.
> Under DOS_NT, this should probably be 'emacs-mule so that CRLF is
> decoded and encoded by default.

I agree. But shouldn't we also set coding.eol_type, for the EOL conversion to take place? I thought 'emacs-mule is not enough, no?

> file-name-buffer-file-type-alist and the untranslate functions. The
> last issue is whether to remove buffer-file-type, but I won't do
> anything about that until more people agree that it is no longer
> necessary.

I think that it can go once the coding system handles everything. 
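The wish above for a simple user option that "would of course just set the coding system" could look roughly like the following. This is a minimal sketch, assuming a current Emacs that provides `coding-system-change-eol-conversion' and `set-buffer-file-coding-system'; the command name `my-set-buffer-eol' is hypothetical.

```elisp
;; Hypothetical command, not part of Emacs: choose the line-ending
;; format used when the current buffer is saved, without the user
;; having to name a coding system.
(defun my-set-buffer-eol (style)
  "Make the current buffer save with STYLE line endings.
STYLE is the symbol `unix' (LF) or `dos' (CRLF)."
  (interactive
   (list (intern (completing-read "EOL style: " '("unix" "dos") nil t))))
  (set-buffer-file-coding-system
   ;; Keep the character encoding, change only the EOL conversion.
   (coding-system-change-eol-conversion
    (or buffer-file-coding-system 'undecided) style)))
```

M-x my-set-buffer-eol RET dos RET then marks the buffer to be written with CRLF line endings the next time it is saved.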
We need to decide whether the T: or B: in the modeline is necessary (it seems that the coding system characters show the same information). From eliz@is.elta.co.il Sun Jul 13 11:02:20 1997 X-VM-v5-Data: ([nil nil nil nil nil nil nil nil nil] [nil "Sun" "13" "July" "1997" "21:02:01" "+0300" "Eli Zaretskii" "eliz@is.elta.co.il" nil "65" "CRLF on DOS_NT" "^From:" nil nil "7" nil nil nil nil] nil) Received: from is.elta.co.il (is.elta.co.il [199.203.121.2]) by june.cs.washington.edu (8.8.5+CS/7.2ju) with SMTP id LAA13569 for ; Sun, 13 Jul 1997 11:02:19 -0700 Received: by is.elta.co.il (SMI-8.6/SMI-SVR4) id VAA28793; Sun, 13 Jul 1997 21:02:01 +0300 X-Sender: eliz@is Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII From: Eli Zaretskii To: Richard Stallman cc: Geoff Voelker , Andrew Innes Subject: CRLF on DOS_NT Date: Sun, 13 Jul 1997 21:02:01 +0300 (IDT) The following changes are required to make CRLF <-> LF conversion work in most common cases. I have deliberately not tried to get them into final shape, since I need to learn more about the coding systems, and because Geoff said he will work on that. I didn't install these changes, for these reasons (and also because I didn't have enough time to do that today). See also my other mail about the file format translation. (Geoff, the `callproc.c' patch is DOS-specific, since that fragment is for DOS only; you might consider looking up the relevant code for the NT subprocess support.) 1997-07-10 Eli Zaretskii * fileio.c (Fwrite_region) [DOS_NT]: Always use binary mode since coding conversion now takes care of NL -> CRLF. *** src/fileio.c~0 Tue Jul 8 11:36:00 1997 --- src/fileio.c Thu Jul 10 23:16:14 1997 *************** to the file, instead of any buffer conte *** 3799,3806 **** struct gcpro gcpro1, gcpro2, gcpro3, gcpro4, gcpro5; struct buffer *given_buffer; #ifdef DOS_NT ! int buffer_file_type ! = NILP (current_buffer->buffer_file_type) ? 
O_TEXT : O_BINARY; #endif /* DOS_NT */ struct coding_system coding; --- 3799,3805 ---- struct gcpro gcpro1, gcpro2, gcpro3, gcpro4, gcpro5; struct buffer *given_buffer; #ifdef DOS_NT ! int buffer_file_type = O_BINARY; #endif /* DOS_NT */ struct coding_system coding; 1997-07-11 Eli Zaretskii * callproc.c (Fcall_process) [MSDOS]: Request EOL conversion of the process output, unless we were promised it is binary. *** src/callproc.c~0 Mon Jul 7 00:56:00 1997 --- src/callproc.c Fri Jul 11 21:48:30 1997 *************** If you quit, the process is killed with *** 295,300 **** --- 295,311 ---- val = Qnil; } setup_coding_system (Fcheck_coding_system (val), &process_coding); + #ifdef MSDOS + /* FIXME: this probably should be moved into the guts of + `Ffind_operation_coding_system' for the case of `call-process'. */ + if (NILP (Vbinary_process_output)) + { + process_coding.eol_type = CODING_EOL_CRLF; + if (process_coding.type == coding_type_no_conversion) + /* FIXME: should we set type to undecided? 
*/ + process_coding.type = coding_type_emacs_mule; + } + #endif } } From rms@gnu.ai.mit.edu Sun Jul 13 14:41:06 1997 X-VM-v5-Data: ([nil nil nil nil nil nil nil nil nil] [nil "Sun" "13" "July" "1997" "17:41:39" "-0400" "Richard Stallman" "rms@gnu.ai.mit.edu" nil "6" "Re: New way to handle CRLF in Emacs 20.0" "^From:" nil nil "7" nil nil nil nil] nil) Received: from psilocin.gnu.ai.mit.edu (psilocin.gnu.ai.mit.edu [128.52.46.62]) by june.cs.washington.edu (8.8.5+CS/7.2ju) with ESMTP id OAA20208 for ; Sun, 13 Jul 1997 14:41:05 -0700 Received: by psilocin.gnu.ai.mit.edu (8.8.5/8.6.12GNU) id RAA25670; Sun, 13 Jul 1997 17:41:39 -0400 Message-Id: <199707132141.RAA25670@psilocin.gnu.ai.mit.edu> In-reply-to: (message from Eli Zaretskii on Sun, 13 Jul 1997 20:59:41 +0300 (IDT)) References: From: Richard Stallman To: eliz@is.elta.co.il CC: voelker@cs.washington.edu, andrewi@harlequin.co.uk Subject: Re: New way to handle CRLF in Emacs 20.0 Date: Sun, 13 Jul 1997 17:41:39 -0400 be nice to have options e.g. to ask the user whether the guess is correct, or require more than a single CRLF before a decision is made. (I didn't think about this too much, so I might be wrong.) I think that is not worth the trouble, given that we still have find-file-text and find-file-binary. 
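The idea quoted above, of requiring more than a single CRLF before deciding, can be sketched as follows. This is only an illustration under stated assumptions, not how Emacs itself detects line endings; the function name `my-guess-eol-style' and the default threshold of two CRLF pairs are hypothetical.

```elisp
;; Hypothetical heuristic, for illustration only.
(defun my-guess-eol-style (text &optional min-crlf)
  "Guess the line-ending style of the string TEXT.
Return `dos' if TEXT has at least MIN-CRLF (default 2) CRLF pairs
and no bare LF, `unix' if it has bare LFs and no CRLF at all, and
`ambiguous' otherwise."
  (let ((crlf 0)
        (bare-lf 0)
        (start 0)
        (min-crlf (or min-crlf 2)))
    ;; Walk over every line terminator and classify it by its length:
    ;; two characters means CRLF, one means a bare LF.
    (while (string-match "\r?\n" text start)
      (if (= (- (match-end 0) (match-beginning 0)) 2)
          (setq crlf (1+ crlf))
        (setq bare-lf (1+ bare-lf)))
      (setq start (match-end 0)))
    (cond ((and (>= crlf min-crlf) (= bare-lf 0)) 'dos)
          ((and (= crlf 0) (> bare-lf 0)) 'unix)
          (t 'ambiguous))))
```

A caller could pass only the first few kilobytes of a file and fall back to asking the user, or to treating the file as binary, when the result is `ambiguous'.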
From rms@gnu.ai.mit.edu Sun Jul 13 14:43:43 1997 X-VM-v5-Data: ([nil nil nil nil nil nil nil nil nil] [nil "Sun" "13" "July" "1997" "17:44:11" "-0400" "Richard Stallman" "rms@gnu.ai.mit.edu" nil "9" "Re: New way to handle CRLF in Emacs 20.0" "^From:" nil nil "7" nil nil nil nil] nil) Received: from psilocin.gnu.ai.mit.edu (psilocin.gnu.ai.mit.edu [128.52.46.62]) by june.cs.washington.edu (8.8.5+CS/7.2ju) with ESMTP id OAA20278 for ; Sun, 13 Jul 1997 14:43:42 -0700 Received: by psilocin.gnu.ai.mit.edu (8.8.5/8.6.12GNU) id RAA25707; Sun, 13 Jul 1997 17:44:11 -0400 Message-Id: <199707132144.RAA25707@psilocin.gnu.ai.mit.edu> In-reply-to: (message from Eli Zaretskii on Sun, 13 Jul 1997 20:59:41 +0300 (IDT)) References: From: Richard Stallman To: eliz@is.elta.co.il CC: voelker@cs.washington.edu, andrewi@harlequin.co.uk Subject: Re: New way to handle CRLF in Emacs 20.0 Date: Sun, 13 Jul 1997 17:44:11 -0400 I also think that the (un)?translated filesystem feature might be useful to Unix users as well. I can imagine NT or even DOS disks mounted via networks, A feature like this could be useful; but some of the present details don't fit this new context. If you are running Emacs on a GNU system, "untranslated" file systems are the usual case; file systems for which new files should be translated are the special case. This is the opposite of the situation for MSDOS. 
From rms@gnu.ai.mit.edu Sun Jul 13 14:44:49 1997 X-VM-v5-Data: ([nil nil nil nil t nil nil nil nil] [nil "Sun" "13" "July" "1997" "17:45:24" "-0400" "Richard Stallman" "rms@gnu.ai.mit.edu" "<199707132145.RAA25719@psilocin.gnu.ai.mit.edu>" "16" "Re: New way to handle CRLF in Emacs 20.0" "^From:" nil nil "7" nil nil nil nil] nil) Received: from psilocin.gnu.ai.mit.edu (psilocin.gnu.ai.mit.edu [128.52.46.62]) by june.cs.washington.edu (8.8.5+CS/7.2ju) with ESMTP id OAA20297 for ; Sun, 13 Jul 1997 14:44:49 -0700 Received: by psilocin.gnu.ai.mit.edu (8.8.5/8.6.12GNU) id RAA25719; Sun, 13 Jul 1997 17:45:24 -0400 Message-Id: <199707132145.RAA25719@psilocin.gnu.ai.mit.edu> In-reply-to: (message from Eli Zaretskii on Sun, 13 Jul 1997 20:59:41 +0300 (IDT)) References: From: Richard Stallman To: eliz@is.elta.co.il CC: voelker@cs.washington.edu, andrewi@harlequin.co.uk Subject: Re: New way to handle CRLF in Emacs 20.0 Date: Sun, 13 Jul 1997 17:45:24 -0400 > Currently, the default for file-coding-system-alist is 'undecided. > Under DOS_NT, this should probably be 'emacs-mule No, definitely not. so that CRLF is > decoded and encoded by default. CRLF encoding is supposed to happen just the same for undecided as it does for emacs-mule. We need to decide whether the T: or B: in the modeline is necessary (it seems that the coding system characters show the same information). Yes, that is something we should decide right now. 
From rms@gnu.ai.mit.edu Fri Jul 18 20:12:22 1997 X-VM-v5-Data: ([nil nil nil nil t t nil nil nil] [nil "Fri" "18" "July" "1997" "23:13:02" "-0400" "Richard Stallman" "rms@gnu.ai.mit.edu" "<199707190313.XAA18973@psilocin.gnu.ai.mit.edu>" "71" "Re: CRLF on DOS_NT" "^From:" nil nil "7" nil nil nil nil] nil) Received: from psilocin.gnu.ai.mit.edu (psilocin.gnu.ai.mit.edu [128.52.46.62]) by june.cs.washington.edu (8.8.5+CS/7.2ju) with ESMTP id UAA22532 for ; Fri, 18 Jul 1997 20:12:21 -0700 Received: by psilocin.gnu.ai.mit.edu (8.8.5/8.6.12GNU) id XAA18973; Fri, 18 Jul 1997 23:13:02 -0400 Message-Id: <199707190313.XAA18973@psilocin.gnu.ai.mit.edu> In-reply-to: <199707182334.QAA39176@joker.cs.washington.edu> (voelker@cs.washington.edu) References: <199707132042.QAA25113@psilocin.gnu.ai.mit.edu> <199707160153.SAA23676@joker.cs.washington.edu> <199707182305.TAA17659@psilocin.gnu.ai.mit.edu> <199707182334.QAA39176@joker.cs.washington.edu> From: Richard Stallman To: voelker@cs.washington.edu Subject: Re: CRLF on DOS_NT Date: Fri, 18 Jul 1997 23:13:02 -0400 I've always interpreted the semantics of specifying 'nil' (text) in file-name-buffer-file-type-alist as being that you explicitly want CRLF separating lines. For example, no matter what, you want config.sys to have CRLFs between the lines. That is a good point. So here's the change I've made. But I wonder whether emacs-mule-dos is the right coding system in other respects. You've argued that -dos is right, but is emacs-mule right? *** dos-w32.el 1997/07/18 22:54:23 1.6 --- dos-w32.el 1997/07/19 03:10:17 *************** *** 102,128 **** If the match is nil (for text): 'emacs-mule-dos' Otherwise: If the file exists: 'undecided' ! If the file does not exist: 'emacs-mule-dos' If COMMAND is 'write-region', the coding system is chosen based upon the value of 'buffer-file-type': If t, the coding system is 'no-conversion', otherwise it is 'emacs-mule-dos'." (let ((op (nth 0 command)) (target) ! 
(binary) (undecided nil)) (cond ((eq op 'insert-file-contents) (setq target (nth 1 command)) (setq binary (find-buffer-file-type target)) ! (if (not binary) ! (setq undecided ! (and (file-exists-p target) ! (not (find-buffer-file-type-match target)))))) ((eq op 'write-region) (setq binary buffer-file-type))) (cond (binary '(no-conversion . no-conversion)) (undecided '(undecided . undecided)) ! (t '(emacs-mule-dos . emacs-mule-dos))))) (modify-coding-system-alist 'file "" 'find-buffer-file-type-coding-system) --- 102,129 ---- If the match is nil (for text): 'emacs-mule-dos' Otherwise: If the file exists: 'undecided' ! If the file does not exist: 'undecided-dos' If COMMAND is 'write-region', the coding system is chosen based upon the value of 'buffer-file-type': If t, the coding system is 'no-conversion', otherwise it is 'emacs-mule-dos'." (let ((op (nth 0 command)) (target) ! (binary nil) (text nil) (undecided nil)) (cond ((eq op 'insert-file-contents) (setq target (nth 1 command)) (setq binary (find-buffer-file-type target)) ! (unless binary ! (if (find-buffer-file-type-match target) ! (setq text t) ! (setq undecided (file-exists-p target))))) ((eq op 'write-region) (setq binary buffer-file-type))) (cond (binary '(no-conversion . no-conversion)) + (text '(emacs-mule-dos . emacs-mule-dos)) (undecided '(undecided . undecided)) ! (t '(undecided-dos . 
undecided-dos))))) (modify-coding-system-alist 'file "" 'find-buffer-file-type-coding-system) From Marc.Fleischeuers@kub.nl Fri Aug 1 01:07:26 1997 X-VM-v5-Data: ([nil nil nil nil t nil nil nil nil] [nil "" " 1" "August" "1997" "10:07:17" "+0200" "Marc Fleischeuers" "Marc.Fleischeuers@kub.nl" "" "69" "Re: [Marc.Fleischeuers@kub.nl: Emacs 20.0.92 on Windows NT 4.0: error converting cr-lf]" "^From:" nil nil "8" nil nil nil nil] nil) Received: from mailnews.kub.nl (mailnews.kub.nl [137.56.0.220]) by june.cs.washington.edu (8.8.5+CS/7.2ju) with ESMTP id BAA21136 for ; Fri, 1 Aug 1997 01:07:24 -0700 Received: from PI0737.kub.nl (pi0737.kub.nl [137.56.38.229]) by mailnews.kub.nl (8.8.5/8.7.1) with SMTP id KAA27228; Fri, 1 Aug 1997 10:07:19 +0200 (MET DST) References: <199707291709.NAA14978@psilocin.gnu.ai.mit.edu> <199707310605.XAA15156@joker.cs.washington.edu> <199707312038.NAA15222@joker.cs.washington.edu> <199707312342.TAA19727@psilocin.gnu.ai.mit.edu> In-Reply-To: Richard Stallman's message of Thu, 31 Jul 1997 19:42:35 -0400 Message-ID: Lines: 69 X-Mailer: Gnus v5.3/Emacs 19.33 From: Marc Fleischeuers Sender: marcf@PI0737.kub.nl To: Richard Stallman Cc: voelker@cs.washington.edu, Marc.Fleischeuers@kub.nl, andrewi@harlequin.co.uk Subject: Re: [Marc.Fleischeuers@kub.nl: Emacs 20.0.92 on Windows NT 4.0: error converting cr-lf] Date: 01 Aug 1997 10:07:17 +0200 > I'm not sure I understand what you are trying to do. When a file is > inside of Emacs, line are always terminated by newlines. The line > termination that exists when the file is in the filesystem is only > placed there when the file is written out. There is no need to > explicitly place CR or LF characters in a file to change the > termination used. > > You're right--but perhaps this can be a clue to finding a place > where the documentation needs to be made clearer. So it is worth figuring > out why Marc got the wrong idea. My intent was too straightforward, obviously. 
I noticed some problems with msdos-type files (note: these files were not created with emacs, but with other programs, most notably netscape). I knew about the cr-lf line ending convention, so in an attempt to create a msdos file in emacs, I ended lines with an explicit `C-q C-m C-q C-j'. Please note that this works as expected in emacs 19.33 (i386-*-Win NT 4.0).

The variables Geoff mentioned (buffer-file-type and coding-system-for-write) have sent me off on a chase through emacs' help. Skip to the last paragraph if you are not interested in the dead ends.

First, `C-h v buffer-file-type' mentions that this is a MS-DOG and Windows NT-only variable, and that its value is nil. I tried to set the variable with M-x set-variable RET buffer-file-type but when I press return all I get is [no match]. I don't think this is a great loss though; surely with so many advanced encoding and decoding facilities, there is no more need for MSDOG as a special case?

On to `coding-system-for-write'. The documentation mentions that this is a variable for internal use only. Setting it would probably require lisp. The appropriate values for this variable should be taken from `coding-system-alist'. There is however no documentation for this variable (`C-h v coding-system-alist' -> [no match]). Still, an internal variable is not the first thing to use if I want to create an ms-dos file.

Apropos'ing around I found another promising variable, `buffer-file-format', valid values for which are found in `format-alist'. In this alist there seems to be an appropriate format, `ibm'. However, `M-x set-variable RET buffer-file-format' again gives [no match].

What I should have used all along was `M-x set-buffer-file-coding-system RET iso-latin-1-dos'. This function is accessible from the menu ([menu-bar mule set-various-coding-systems set-buffer-file-coding-system]) and from the C-x RET keymap.
However, it was only from the resulting file that I could see that it was what I wanted (in fact there may still be a better way). The documentation for `set-buffer-file-coding-system' does not mention to what values it can be set, and the description in `M-x describe-coding-system' does not mention what any of the listed coding systems do. In fact, after I selected iso-latin-1-dos, it was described as Current buffer file: buffer-file-coding-system - -- undecided-dos The short answer is the documentation for describe-coding-system and set-*-coding-system could be improved upon. For describe-coding-system, why is it necessary to mention the priority of coding systems? Instead, use the space to explain what the selected coding systems do. For `set-*-coding-systems', it could be mentioned to what values it can be set, and perhaps what they do. Marc -- Computer! End program! Computer! Create _new_ program! From rms@gnu.ai.mit.edu Sat Aug 2 03:18:33 1997 X-VM-v5-Data: ([nil nil nil nil nil nil nil nil nil] [nil "Sat" " 2" "August" "1997" "06:18:47" "-0400" "Richard Stallman" "rms@gnu.ai.mit.edu" nil "13" "Re: [Marc.Fleischeuers@kub.nl: Emacs 20.0.92 on Windows NT 4.0: error converting cr-lf]" "^From:" nil nil "8" nil nil nil nil] nil) Received: from psilocin.gnu.ai.mit.edu (psilocin.gnu.ai.mit.edu [128.52.46.62]) by june.cs.washington.edu (8.8.5+CS/7.2ju) with ESMTP id DAA24743 for ; Sat, 2 Aug 1997 03:18:33 -0700 Received: by psilocin.gnu.ai.mit.edu (8.8.5/8.6.12GNU) id GAA13689; Sat, 2 Aug 1997 06:18:47 -0400 Message-Id: <199708021018.GAA13689@psilocin.gnu.ai.mit.edu> In-reply-to: (message from Marc Fleischeuers on 01 Aug 1997 10:07:17 +0200) References: <199707291709.NAA14978@psilocin.gnu.ai.mit.edu> <199707310605.XAA15156@joker.cs.washington.edu> <199707312038.NAA15222@joker.cs.washington.edu> <199707312342.TAA19727@psilocin.gnu.ai.mit.edu> From: Richard Stallman To: Marc.Fleischeuers@kub.nl cc: voelker@cs.washington.edu, andrewi@harlequin.co.uk Subject: Re: 
[Marc.Fleischeuers@kub.nl: Emacs 20.0.92 on Windows NT 4.0: error converting cr-lf] Date: Sat, 2 Aug 1997 06:18:47 -0400 What I should have used all along was `M-x set-buffer-file-coding-system RET iso-latin-1-dos'. This function is accessible from the menu ([menu-bar mule set-various-coding-systems set-buffer-file-coding-system]) and from the C-x RET keymap. However, it was only from the resulting file that I could see that it was what I improved the doc of this command. But that won't fully solve the problem. What I really should do is to point you at this command from somewhere else that you would naturally look. Any suggestions for where that could be? From rms@gnu.ai.mit.edu Sat Aug 2 21:23:03 1997 X-VM-v5-Data: ([nil nil nil nil nil nil nil nil nil] [nil "Sun" " 3" "August" "1997" "00:23:09" "-0400" "Richard Stallman" "rms@gnu.ai.mit.edu" nil "8" "Re: [Marc.Fleischeuers@kub.nl: Emacs 20.0.92 on Windows NT 4.0: error converting cr-lf]" "^From:" nil nil "8" nil nil nil nil] nil) Received: from psilocin.gnu.ai.mit.edu (psilocin.gnu.ai.mit.edu [128.52.46.62]) by june.cs.washington.edu (8.8.5+CS/7.2ju) with ESMTP id VAA20521 for ; Sat, 2 Aug 1997 21:23:03 -0700 Received: by psilocin.gnu.ai.mit.edu (8.8.5/8.6.12GNU) id AAA25764; Sun, 3 Aug 1997 00:23:09 -0400 Message-Id: <199708030423.AAA25764@psilocin.gnu.ai.mit.edu> In-reply-to: (message from Marc Fleischeuers on 01 Aug 1997 10:07:17 +0200) References: <199707291709.NAA14978@psilocin.gnu.ai.mit.edu> <199707310605.XAA15156@joker.cs.washington.edu> <199707312038.NAA15222@joker.cs.washington.edu> <199707312342.TAA19727@psilocin.gnu.ai.mit.edu> From: Richard Stallman To: Marc.Fleischeuers@kub.nl CC: voelker@cs.washington.edu, Marc.Fleischeuers@kub.nl, andrewi@harlequin.co.uk, handa@etl.go.jp Subject: Re: [Marc.Fleischeuers@kub.nl: Emacs 20.0.92 on Windows NT 4.0: error converting cr-lf] Date: Sun, 3 Aug 1997 00:23:09 -0400 I knew about the cr-lf line ending convention so in an attempt to create a msdos file in emacs, 
I ended lines with an explicit `C-q Cm C-q C-j'. Please note that this works as expected in emacs 19.33 (i386-*-Win NT 4.0). Is that really true? What algorithm does 19.33 use for LF to CRLF conversion? Maybe we should change the Emacs 20 EOL conversion to do the same thing. From handa@etl.go.jp Sun Aug 3 18:32:56 1997 X-VM-v5-Data: ([nil nil nil nil nil nil nil nil nil] [nil "Mon" " 4" "August" "1997" "10:33:49" "+0900" "Kenichi Handa" "handa@etl.go.jp" nil "33" "Re: [Marc.Fleischeuers@kub.nl: Emacs 20.0.92 on Windows NT 4.0: error converting cr-lf]" "^From:" nil nil "8" nil nil nil nil] nil) Received: from mail1-im.etl.go.jp (mail1-im.etl.go.jp [192.50.105.9]) by june.cs.washington.edu (8.8.5+CS/7.2ju) with ESMTP id SAA20790 for ; Sun, 3 Aug 1997 18:32:53 -0700 Received: from etlpom.etl.go.jp (etlpom.etl.go.jp [192.31.200.9]) by mail1-im.etl.go.jp (8.8.5/3.5Wpl1-96112918) with ESMTP id KAA06878; Mon, 4 Aug 1997 10:32:33 +0900 (JST) Received: from etlken.etl.go.jp (etlken.etl.go.jp [192.31.197.11]) by etlpom.etl.go.jp (8.8.5/3.5Wpl4-ETL_MASTER) with SMTP id KAA00812; Mon, 4 Aug 1997 10:32:33 +0900 (JST) Received: by etlken.etl.go.jp (SMI-8.6/6.4J.6-ETL.SLAVE) id KAA04718; Mon, 4 Aug 1997 10:33:49 +0900 Message-Id: <199708040133.KAA04718@etlken.etl.go.jp> In-reply-to: <199708030423.AAA25764@psilocin.gnu.ai.mit.edu> (message from Richard Stallman on Sun, 3 Aug 1997 00:23:09 -0400) References: <199707291709.NAA14978@psilocin.gnu.ai.mit.edu> <199707310605.XAA15156@joker.cs.washington.edu> <199707312038.NAA15222@joker.cs.washington.edu> <199707312342.TAA19727@psilocin.gnu.ai.mit.edu> <199708030423.AAA25764@psilocin.gnu.ai.mit.edu> From: Kenichi Handa To: rms@gnu.ai.mit.edu CC: Marc.Fleischeuers@kub.nl, voelker@cs.washington.edu, Marc.Fleischeuers@kub.nl, andrewi@harlequin.co.uk Subject: Re: [Marc.Fleischeuers@kub.nl: Emacs 20.0.92 on Windows NT 4.0: error converting cr-lf] Date: Mon, 4 Aug 1997 10:33:49 +0900 Date: Sun, 3 Aug 1997 00:23:09 -0400 From: Richard Stallman 
I knew about the cr-lf line ending convention so in an attempt to create a msdos file in emacs, I ended lines with an explicit `C-q Cm C-q C-j'. Please note that this works as expected in emacs 19.33 (i386-*-Win NT 4.0). Is that really true? What algorithm does 19.33 use for LF to CRLF conversion? Maybe we should change the Emacs 20 EOL conversion to do the same thing. Since the above is the first mail I get about this thread, this reply may fail to catch the point... I don't know why the above doesn't work for Emacs 20. I've just tried the following. 1) At first, visit a new file. 2) type `a b c C-q C-m C-q C-j' 3) save it. 4) visit it again. Then the file is read as `undecided-dos' and the buffer contents are 4-byte of: abc\C-j This means that CR LF is decoded to single LF. But, since buffer-file-coding-system is undecided-dos, when I edit this file and save it, all LFs are encoded back to CR LF. --- Ken'ichi HANDA handa@etl.go.jp From Marc.Fleischeuers@kub.nl Mon Aug 4 01:42:41 1997 X-VM-v5-Data: ([nil nil nil nil nil nil nil nil nil] [nil "" " 4" "August" "1997" "10:42:27" "+0200" "Marc Fleischeuers" "Marc.Fleischeuers@kub.nl" nil "25" "Re: [Marc.Fleischeuers@kub.nl: Emacs 20.0.92 on Windows NT 4.0: error converting cr-lf]" "^From:" nil nil "8" nil nil nil nil] nil) Received: from mailnews.kub.nl (mailnews.kub.nl [137.56.0.220]) by june.cs.washington.edu (8.8.5+CS/7.2ju) with ESMTP id BAA03503 for ; Mon, 4 Aug 1997 01:42:36 -0700 Received: from PI0737.kub.nl (pi0737.kub.nl [137.56.38.229]) by mailnews.kub.nl (8.8.5/8.7.1) with SMTP id KAA27167; Mon, 4 Aug 1997 10:42:25 +0200 (MET DST) References: <199707291709.NAA14978@psilocin.gnu.ai.mit.edu> <199707310605.XAA15156@joker.cs.washington.edu> <199707312038.NAA15222@joker.cs.washington.edu> <199707312342.TAA19727@psilocin.gnu.ai.mit.edu> <199708021018.GAA13689@psilocin.gnu.ai.mit.edu> In-Reply-To: Richard Stallman's message of Sat, 2 Aug 1997 06:18:47 -0400 Message-ID: Lines: 25 X-Mailer: Gnus v5.3/Emacs 19.33 
From: Marc Fleischeuers
Sender: marcf@PI0737.kub.nl
To: Richard Stallman
Cc: Marc.Fleischeuers@kub.nl, voelker@cs.washington.edu, andrewi@harlequin.co.uk
Subject: Re: [Marc.Fleischeuers@kub.nl: Emacs 20.0.92 on Windows NT 4.0: error converting cr-lf]
Date: 04 Aug 1997 10:42:27 +0200

Richard Stallman writes:

> What I really should do is to point you at this command from somewhere
> else that you would naturally look.
>
> Any suggestions for where that could be?

The command is in the menu and in the advertised C-x RET keymap; I think anyone investigating emacs' new features should find these functions easily (I didn't go there straightaway because I first followed the suggestions by Geoff Voelker). In the menu-bar there is even a corresponding `describe' entry for input method and coding systems. I think this is a good thing; this is where I would look if I were a user. In fact I did look there when I first started emacs 20; it's just that the descriptions are not very informative about what the functions actually do for me (input methods do not work (yet?) so I cannot comment on that).

If the documentation for set-buffer-file-coding-system, and `M-x describe-coding-system' give information about the available, resp. selected coding systems and what they do for me, I think this should do it.
Marc From Marc.Fleischeuers@kub.nl Mon Aug 4 02:40:35 1997 X-VM-v5-Data: ([nil nil nil nil nil nil nil nil nil] [nil "" " 4" "August" "1997" "11:40:11" "+0200" "Marc Fleischeuers" "Marc.Fleischeuers@kub.nl" nil "45" "Re: [Marc.Fleischeuers@kub.nl: Emacs 20.0.92 on Windows NT 4.0: error converting cr-lf]" "^From:" nil nil "8" nil nil nil nil] nil) Received: from mailnews.kub.nl (mailnews.kub.nl [137.56.0.220]) by june.cs.washington.edu (8.8.5+CS/7.2ju) with ESMTP id CAA04516 for ; Mon, 4 Aug 1997 02:40:28 -0700 Received: from PI0737.kub.nl (pi0737.kub.nl [137.56.38.229]) by mailnews.kub.nl (8.8.5/8.7.1) with SMTP id LAA00950; Mon, 4 Aug 1997 11:40:09 +0200 (MET DST) References: <199707291709.NAA14978@psilocin.gnu.ai.mit.edu> <199707310605.XAA15156@joker.cs.washington.edu> <199707312038.NAA15222@joker.cs.washington.edu> <199707312342.TAA19727@psilocin.gnu.ai.mit.edu> <199708030423.AAA25764@psilocin.gnu.ai.mit.edu> <199708040133.KAA04718@etlken.etl.go.jp> In-Reply-To: Kenichi Handa's message of Mon, 4 Aug 1997 10:33:49 +0900 Message-ID: Lines: 45 X-Mailer: Gnus v5.3/Emacs 19.33 From: Marc Fleischeuers Sender: marcf@PI0737.kub.nl To: Kenichi Handa Cc: rms@gnu.ai.mit.edu, Marc.Fleischeuers@kub.nl, voelker@cs.washington.edu, andrewi@harlequin.co.uk Subject: Re: [Marc.Fleischeuers@kub.nl: Emacs 20.0.92 on Windows NT 4.0: error converting cr-lf] Date: 04 Aug 1997 11:40:11 +0200 Kenichi Handa writes: > I've just tried the following. > 1) At first, visit a new file. > 2) type `a b c C-q C-m C-q C-j' > 3) save it. > 4) visit it again. > > Then the file is read as `undecided-dos' and the buffer contents are > 4-byte of: > abc\C-j > This means that CR LF is decoded to single LF. But, since > buffer-file-coding-system is undecided-dos, when I edit this file and > save it, all LFs are encoded back to CR LF. This is the way it should be, unfortunately it is not for me. I have repeated the four steps above. 
When I first open a new file, the buffer-file-coding-system is nil and the mode-line indicator is `:'. If I insert `a b c C-q C-m C-q C-j' in the buffer and then save the file, the buffer contains the five bytes `abc\C-m\C-j', buffer-file-coding-system is still nil, and the mode-line indicator is still `:'. With `c:\emacs\bin\hexl abc', the contents of the file are `6162 630d 0d0a'.

If I then re-visit the file (`C-x C-v RET') it contains six bytes `abc\C-j\C-j\C-j', buffer-file-coding-system is `- -- undecided-mac', and the mode-line indicator is `/'.

I started emacs with `c:\emacs\bin\emacs.bat --no-site-file --no-init-file'. The batch file sets a number of environment variables. It is not modified from the one generated by the install process. I use emacs 20.0.92 on Windows NT 4.0, compiled with MS VC++ 4.2.

I have also used the following version, started the same way, to perform exactly the same steps:

In GNU Emacs 19.33.1 (i386-*-nt4.0) of Wed Aug 14 1996 on BANANA-FISH
configured using `configure NT'

The file is written and read back in as the five bytes `abc\C-m\C-j'. There is a mode-line indicator `(T:', both when I first open the file and when I read it back in.
Marc From handa@etl.go.jp Mon Aug 4 04:43:24 1997 X-VM-v5-Data: ([nil nil nil nil nil nil nil nil nil] [nil "Mon" " 4" "August" "1997" "20:37:31" "+0900" "Kenichi Handa" "handa@etl.go.jp" nil "69" "Re: [Marc.Fleischeuers@kub.nl: Emacs 20.0.92 on Windows NT 4.0: error converting cr-lf]" "^From:" nil nil "8" nil nil nil nil] nil) Received: from mail1-im.etl.go.jp (mail1-im.etl.go.jp [192.50.105.9]) by june.cs.washington.edu (8.8.5+CS/7.2ju) with ESMTP id EAA06611 for ; Mon, 4 Aug 1997 04:43:23 -0700 Received: from etlpom.etl.go.jp (etlpom.etl.go.jp [192.31.200.9]) by mail1-im.etl.go.jp (8.8.5/3.5Wpl1-96112918) with ESMTP id UAA11557; Mon, 4 Aug 1997 20:36:15 +0900 (JST) Received: from etlken.etl.go.jp (etlken.etl.go.jp [192.31.197.11]) by etlpom.etl.go.jp (8.8.5/3.5Wpl4-ETL_MASTER) with SMTP id UAA07947; Mon, 4 Aug 1997 20:36:15 +0900 (JST) Received: by etlken.etl.go.jp (SMI-8.6/6.4J.6-ETL.SLAVE) id UAA05301; Mon, 4 Aug 1997 20:37:31 +0900 Message-Id: <199708041137.UAA05301@etlken.etl.go.jp> In-reply-to: (message from Marc Fleischeuers on 04 Aug 1997 11:40:11 +0200) References: <199707291709.NAA14978@psilocin.gnu.ai.mit.edu> <199707310605.XAA15156@joker.cs.washington.edu> <199707312038.NAA15222@joker.cs.washington.edu> <199707312342.TAA19727@psilocin.gnu.ai.mit.edu> <199708030423.AAA25764@psilocin.gnu.ai.mit.edu> <199708040133.KAA04718@etlken.etl.go.jp> From: Kenichi Handa To: Marc.Fleischeuers@kub.nl CC: rms@gnu.ai.mit.edu, Marc.Fleischeuers@kub.nl, voelker@cs.washington.edu, andrewi@harlequin.co.uk Subject: Re: [Marc.Fleischeuers@kub.nl: Emacs 20.0.92 on Windows NT 4.0: error converting cr-lf] Date: Mon, 4 Aug 1997 20:37:31 +0900 From: Marc Fleischeuers Date: 04 Aug 1997 11:40:11 +0200 Kenichi Handa writes: > I've just tried the following. > 1) At first, visit a new file. > 2) type `a b c C-q C-m C-q C-j' > 3) save it. > 4) visit it again. 
> > Then the file is read as `undecided-dos' and the buffer contents are > 4-byte of: > abc\C-j > This means that CR LF is decoded to single LF. But, since > buffer-file-coding-system is undecided-dos, when I edit this file and > save it, all LFs are encoded back to CR LF. This is the way it should be, unfortunately it is not for me. I have repeated the four steps above. When I first open a new file, the buffer-file-coding-system is nil and the mode-line indicator is `:'. If I insert `a b c C-q C-m C-q C-j' in the buffer and then save the file, the buffer contains the five bytes `abc\C-m\C-j', buffer-file-coding-system is still nil, and the mode-line indicator is still `:'. With `c:\emacs\bin\hexl abc', the contents of the file is `6162 630d 0d0a'. Hmm, the sequence CR LF was written out as CR CR LF. It seems that the file is opened by O_TEXT instead of O_BINARY. But, this should have been fixed in 20.0.92 already. Strange... Could you please check the file src/fileio.c? Is it applied the following patch made by ? ------------------------------------------------------------ RCS file: RCS/fileio.c,v retrieving revision 1.250 retrieving revision 1.251 diff -u -r1.250 -r1.251 --- fileio.c 1997/07/12 06:43:08 1.250 +++ fileio.c 1997/07/13 20:37:01 1.251 @@ -3799,8 +3799,7 @@ struct gcpro gcpro1, gcpro2, gcpro3, gcpro4, gcpro5; struct buffer *given_buffer; #ifdef DOS_NT - int buffer_file_type - = NILP (current_buffer->buffer_file_type) ? O_TEXT : O_BINARY; + int buffer_file_type = O_BINARY; #endif /* DOS_NT */ struct coding_system coding; ------------------------------------------------------------ If I then re-visit the file (`C-x C-v RET') it contains six bytes `abc\C-j\C-j\C-j', buffer-file-coding-system is `- -- undecided-mac', and the mode-line indicator is `/'. This is an expected behaviour when Emacs reads `a b c CR CR LF'. When Emacs encounters CR not followed by LF, it thinks the end-of-line format for the file is CR (Mac's convention), and translate CR to LF. 
LF is read as is. So, the buffer contains three LFs. So, the problem seems to be in writing a file. Anyway, I don't have Windows NT. I asked a person who is an expert of Windows to check the code. --- Ken'ichi HANDA handa@etl.go.jp From Marc.Fleischeuers@kub.nl Mon Aug 4 05:21:28 1997 X-VM-v5-Data: ([nil nil nil nil nil nil nil nil nil] [nil "" " 4" "August" "1997" "14:18:51" "+0200" "Marc Fleischeuers" "Marc.Fleischeuers@kub.nl" nil "22" "Re: [Marc.Fleischeuers@kub.nl: Emacs 20.0.92 on Windows NT 4.0: error converting cr-lf]" "^From:" nil nil "8" nil nil nil nil] nil) Received: from mailnews.kub.nl (mailnews.kub.nl [137.56.0.220]) by june.cs.washington.edu (8.8.5+CS/7.2ju) with ESMTP id FAA07716 for ; Mon, 4 Aug 1997 05:21:26 -0700 Received: from PI0737.kub.nl (pi0737.kub.nl [137.56.38.229]) by mailnews.kub.nl (8.8.5/8.7.1) with SMTP id OAA10741; Mon, 4 Aug 1997 14:18:49 +0200 (MET DST) References: <199707291709.NAA14978@psilocin.gnu.ai.mit.edu> <199707310605.XAA15156@joker.cs.washington.edu> <199707312038.NAA15222@joker.cs.washington.edu> <199707312342.TAA19727@psilocin.gnu.ai.mit.edu> <19 <199708041137.UAA05301@etlken.etl.go.jp> In-Reply-To: Kenichi Handa's message of Mon, 4 Aug 1997 20:37:31 +0900 Message-ID: Lines: 22 X-Mailer: Gnus v5.3/Emacs 19.33 From: Marc Fleischeuers Sender: marcf@PI0737.kub.nl To: Kenichi Handa Cc: Marc.Fleischeuers@kub.nl, rms@gnu.ai.mit.edu, voelker@cs.washington.edu, andrewi@harlequin.co.uk Subject: Re: [Marc.Fleischeuers@kub.nl: Emacs 20.0.92 on Windows NT 4.0: error converting cr-lf] Date: 04 Aug 1997 14:18:51 +0200 Kenichi Handa writes: > Hmm, the sequence CR LF was written out as CR CR LF. It seems that > the file is opened by O_TEXT instead of O_BINARY. But, this should > have been fixed in 20.0.92 already. Strange... > > Could you please check the file src/fileio.c? Is it applied the > following patch made by ? 
This patch is applied (that is, it says ``int buffer_file_type = O_BINARY'', that's what it should be I think) > So, the problem seems to be in writing a file. In a previous post today, I have described how both 19.33 and 20.0.92 both write the same bytes to disk. If the way in which this file is read in is correct (as I understand from your and RMS' posts) then a) the way the cr and lf sequences are interpreted differs between 19.34 and 20.0.92, and b) this difference in reading, is indeed not matched by an appropriate difference in writing. From handa@etl.go.jp Mon Aug 4 05:54:20 1997 X-VM-v5-Data: ([nil nil nil nil nil nil nil nil nil] [nil "Mon" " 4" "August" "1997" "21:54:40" "+0900" "Kenichi Handa" "handa@etl.go.jp" nil "33" "Re: [Marc.Fleischeuers@kub.nl: Emacs 20.0.92 on Windows NT 4.0: error converting cr-lf]" "^From:" nil nil "8" nil nil nil nil] nil) Received: from mail1-im.etl.go.jp (mail1-im.etl.go.jp [192.50.105.9]) by june.cs.washington.edu (8.8.5+CS/7.2ju) with ESMTP id FAA08453 for ; Mon, 4 Aug 1997 05:54:19 -0700 Received: from etlpom.etl.go.jp (etlpom.etl.go.jp [192.31.200.9]) by mail1-im.etl.go.jp (8.8.5/3.5Wpl1-96112918) with ESMTP id VAA13372; Mon, 4 Aug 1997 21:53:24 +0900 (JST) Received: from etlken.etl.go.jp (etlken.etl.go.jp [192.31.197.11]) by etlpom.etl.go.jp (8.8.5/3.5Wpl4-ETL_MASTER) with SMTP id VAA09960; Mon, 4 Aug 1997 21:53:25 +0900 (JST) Received: by etlken.etl.go.jp (SMI-8.6/6.4J.6-ETL.SLAVE) id VAA05378; Mon, 4 Aug 1997 21:54:40 +0900 Message-Id: <199708041254.VAA05378@etlken.etl.go.jp> In-reply-to: (message from Marc Fleischeuers on 04 Aug 1997 14:18:51 +0200) From: Kenichi Handa To: Marc.Fleischeuers@kub.nl CC: rms@gnu.ai.mit.edu, voelker@cs.washington.edu, andrewi@harlequin.co.uk Subject: Re: [Marc.Fleischeuers@kub.nl: Emacs 20.0.92 on Windows NT 4.0: error converting cr-lf] Date: Mon, 4 Aug 1997 21:54:40 +0900 From: Marc Fleischeuers Date: 04 Aug 1997 14:18:51 +0200 > Could you please check the file src/fileio.c? 
Is it applied the
    > following patch made by ?

    This patch is applied (that is, it says ``int buffer_file_type =
    O_BINARY'', that's what it should be I think)

I have just found that lisp/dos-w32.el is doing something about deciding the coding system. Although I have not yet read the code in detail, I suspect that the code decides that the coding system for writing a file on NT is undecided-dos. If that is true, it explains everything, because Emacs writes CR as is and converts LF to CR LF when it writes a file with undecided-dos.

Perhaps Mr. Voelker wrote this code so that NT users don't have to do anything special to create a DOS file. In your case, you don't have to insert \C-m by hand to create a DOS file. Please just try the following:

1) visit a new file
2) type `a b c RET'
3) save the file.
4) visit the file again.

You should be able to create a file of `a b c CR LF' by step 3, and buffer-file-coding-system is set to undecided-dos by step 4.

Mr. Voelker? Is this correct?

---
Ken'ichi HANDA
handa@etl.go.jp

From Marc.Fleischeuers@kub.nl Mon Aug 4 05:59:03 1997
X-VM-v5-Data: ([nil nil nil nil nil nil nil nil nil] [nil "" " 4" "August" "1997" "14:58:42" "+0200" "Marc Fleischeuers" "Marc.Fleischeuers@kub.nl" nil "33" "Re: [Marc.Fleischeuers@kub.nl: Emacs 20.0.92 on Windows NT 4.0: error converting cr-lf]" "^From:" nil nil "8" nil nil nil nil] nil)
Received: from mailnews.kub.nl (mailnews.kub.nl [137.56.0.220]) by june.cs.washington.edu (8.8.5+CS/7.2ju) with ESMTP id FAA08582 for ; Mon, 4 Aug 1997 05:58:53 -0700
Received: from PI0737.kub.nl (pi0737.kub.nl [137.56.38.229]) by mailnews.kub.nl (8.8.5/8.7.1) with SMTP id OAA14494; Mon, 4 Aug 1997 14:58:40 +0200 (MET DST)
References: <199707291709.NAA14978@psilocin.gnu.ai.mit.edu> <199707310605.XAA15156@joker.cs.washington.edu> <199707312038.NAA15222@joker.cs.washington.edu> <199707312342.TAA19727@psilocin.gnu.ai.mit.edu> <19 <199708041137.UAA05301@etlken.etl.go.jp>
In-Reply-To: Marc Fleischeuers's message of 04 Aug 1997 14:18:51 +0200
Message-ID: Lines: 33 X-Mailer: Gnus v5.3/Emacs 19.33 From: Marc Fleischeuers Sender: marcf@PI0737.kub.nl To: Marc Fleischeuers Cc: Kenichi Handa , rms@gnu.ai.mit.edu, voelker@cs.washington.edu, andrewi@harlequin.co.uk Subject: Re: [Marc.Fleischeuers@kub.nl: Emacs 20.0.92 on Windows NT 4.0: error converting cr-lf] Date: 04 Aug 1997 14:58:42 +0200 Marc Fleischeuers writes: > Kenichi Handa writes: > > > Hmm, the sequence CR LF was written out as CR CR LF. It seems that > > the file is opened by O_TEXT instead of O_BINARY. But, this should > > have been fixed in 20.0.92 already. Strange... > > > > Could you please check the file src/fileio.c? Is it applied the > > following patch made by ? > > This patch is applied (that is, it says ``int buffer_file_type = > O_BINARY'', that's what it should be I think) > > > So, the problem seems to be in writing a file. I have examined the value of the lisp-variable `buffer-file-type' in several stages after reading and writing files containing cr and lf sequences. The value of this variable was always nil, indicating a text (i.e., non-binary) file. In buffer-file-type-alist it is set that files with extension '.tpu' are interpreted as binary, so I tried C-x C-f new.tpu a b c C-q C-m C-q C-j C-x C-s C-x C-v RET This file is created containing the intended 5 bytes, and it is read back in "correctly" (buffer contains `abc^M'). After reading the file in, the mode line indicator is `=:', and buffer-file-coding-system is `= -- no-conversion (alias: binary)'. May I argue that the concept of `buffer-file-type' and its associated variables and functions are removed from emacs 20? 
Marc

From Marc.Fleischeuers@kub.nl Mon Aug 4 06:36:36 1997
X-VM-v5-Data: ([nil nil nil nil nil nil nil nil nil] [nil "" " 4" "August" "1997" "15:36:27" "+0200" "Marc Fleischeuers" "Marc.Fleischeuers@kub.nl" nil "14" "Re: [Marc.Fleischeuers@kub.nl: Emacs 20.0.92 on Windows NT 4.0: error converting cr-lf]" "^From:" nil nil "8" nil nil nil nil] nil)
Received: from mailnews.kub.nl (mailnews.kub.nl [137.56.0.220]) by june.cs.washington.edu (8.8.5+CS/7.2ju) with ESMTP id GAA10461 for ; Mon, 4 Aug 1997 06:36:35 -0700
Received: from PI0737.kub.nl (pi0737.kub.nl [137.56.38.229]) by mailnews.kub.nl (8.8.5/8.7.1) with SMTP id PAA16644; Mon, 4 Aug 1997 15:36:25 +0200 (MET DST)
References: <199708041254.VAA05378@etlken.etl.go.jp>
In-Reply-To: Kenichi Handa's message of Mon, 4 Aug 1997 21:54:40 +0900
Message-ID:
Lines: 14
X-Mailer: Gnus v5.3/Emacs 19.33
From: Marc Fleischeuers
Sender: marcf@PI0737.kub.nl
To: Kenichi Handa
Cc: Marc.Fleischeuers@kub.nl, rms@gnu.ai.mit.edu, voelker@cs.washington.edu, andrewi@harlequin.co.uk
Subject: Re: [Marc.Fleischeuers@kub.nl: Emacs 20.0.92 on Windows NT 4.0: error converting cr-lf]
Date: 04 Aug 1997 15:36:27 +0200

Kenichi Handa writes:

> I have just found that lisp/dos-w32.el is doing something about
> deciding coding system.  Although I have not yet read the code in
> detail, I suspect that the code decides that coding system for writing
> a file on NT is undecided-dos.  If it is true, it explains everything,

I was reading there too.  It appears that emacs does a lot of thinking
for me.  I have done some light testing, and it looks like everything
acts like I expect it to, when (untranslated-file-p filename) returns
t.  This is what I'll be doing for a while, until something else
breaks..
Marc

From rms@gnu.ai.mit.edu Mon Aug 4 12:46:38 1997
X-VM-v5-Data: ([nil nil nil nil nil nil nil nil nil] [nil "Mon" " 4" "August" "1997" "15:46:48" "-0400" "Richard Stallman" "rms@gnu.ai.mit.edu" nil "19" "Re: [Marc.Fleischeuers@kub.nl: Emacs 20.0.92 on Windows NT 4.0: error converting cr-lf]" "^From:" nil nil "8" nil nil nil nil] nil)
Received: from psilocin.gnu.ai.mit.edu (psilocin.gnu.ai.mit.edu [128.52.46.62]) by june.cs.washington.edu (8.8.5+CS/7.2ju) with ESMTP id MAA03249 for ; Mon, 4 Aug 1997 12:46:37 -0700
Received: by psilocin.gnu.ai.mit.edu (8.8.5/8.6.12GNU) id PAA22721; Mon, 4 Aug 1997 15:46:48 -0400
Message-Id: <199708041946.PAA22721@psilocin.gnu.ai.mit.edu>
In-reply-to: (message from Marc Fleischeuers on 04 Aug 1997 14:18:51 +0200)
References: <199707291709.NAA14978@psilocin.gnu.ai.mit.edu> <199707310605.XAA15156@joker.cs.washington.edu> <199707312038.NAA15222@joker.cs.washington.edu> <199707312342.TAA19727@psilocin.gnu.ai.mit.edu> <19 <199708041137.UAA05301@etlken.etl.go.jp>
From: Richard Stallman
To: Marc.Fleischeuers@kub.nl
CC: handa@etl.go.jp, Marc.Fleischeuers@kub.nl, voelker@cs.washington.edu, andrewi@harlequin.co.uk
Subject: Re: [Marc.Fleischeuers@kub.nl: Emacs 20.0.92 on Windows NT 4.0: error converting cr-lf]
Date: Mon, 4 Aug 1997 15:46:48 -0400

    > Hmm, the sequence CR LF was written out as CR CR LF.  It seems that
    > the file is opened by O_TEXT instead of O_BINARY.

I would expect this is because of the usual DOS eol conversion,
and not because of O_TEXT.

    This patch is applied (that is, it says ``int buffer_file_type =
    O_BINARY'', that's what it should be I think)

I am not surprised.  The same code in Emacs that converts just LF to
CR LF will of course do so when the preceding character is a CR--
unless there is something special to stop it.

As far as I know, there is nothing special to avoid encoding LF as CR
LF based on the preceding character.  Handa, is there?
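
The conversion described in the message above can be sketched in a few lines. This is an illustration in Python, not the Emacs C code: the function name `encode_dos_eol` is hypothetical, and the only assumption is the one stated in the message, namely that every LF is written as CR LF without looking at the preceding character.

```python
def encode_dos_eol(buffer_text: bytes) -> bytes:
    """Naive DOS-style EOL encoding: every LF becomes CR LF.

    Hypothetical helper for illustration only. It never inspects the
    byte before the LF, so a CR already in the buffer is kept as is.
    """
    return buffer_text.replace(b"\n", b"\r\n")


# Buffer contents after typing `a b c C-q C-m C-q C-j'.
buffer_text = b"abc\r\n"
file_bytes = encode_dos_eol(buffer_text)

# Dump in the same 2-byte grouping that hexl uses.
print(file_bytes.hex(" ", 2))  # 6162 630d 0d0a
```

The CR in the buffer is copied through and the LF after it is expanded, which gives the CR CR LF sequence reported for the file.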
From rms@gnu.ai.mit.edu Mon Aug 4 12:55:40 1997 X-VM-v5-Data: ([nil nil nil nil nil nil nil nil nil] [nil "Mon" " 4" "August" "1997" "15:55:46" "-0400" "Richard Stallman" "rms@gnu.ai.mit.edu" nil "11" "Re: [Marc.Fleischeuers@kub.nl: Emacs 20.0.92 on Windows NT 4.0: error converting cr-lf]" "^From:" nil nil "8" nil nil nil nil] nil) Received: from psilocin.gnu.ai.mit.edu (psilocin.gnu.ai.mit.edu [128.52.46.62]) by june.cs.washington.edu (8.8.5+CS/7.2ju) with ESMTP id MAA03818 for ; Mon, 4 Aug 1997 12:55:39 -0700 Received: by psilocin.gnu.ai.mit.edu (8.8.5/8.6.12GNU) id PAA22807; Mon, 4 Aug 1997 15:55:46 -0400 Message-Id: <199708041955.PAA22807@psilocin.gnu.ai.mit.edu> In-reply-to: <199708041254.VAA05378@etlken.etl.go.jp> (message from Kenichi Handa on Mon, 4 Aug 1997 21:54:40 +0900) References: <199708041254.VAA05378@etlken.etl.go.jp> From: Richard Stallman To: handa@etl.go.jp CC: Marc.Fleischeuers@kub.nl, voelker@cs.washington.edu, andrewi@harlequin.co.uk Subject: Re: [Marc.Fleischeuers@kub.nl: Emacs 20.0.92 on Windows NT 4.0: error converting cr-lf] Date: Mon, 4 Aug 1997 15:55:46 -0400 If it is true, it explains everything, because Emacs writes CR as is and converts LF to CR LF when it writes a file by undecided-dos. Yes, of course. I've been telling both of you this over and over. You've been trying to unravel a mystery which is not a mystery at all. The real question is, should we put in a special feature to override that behavior when the buffer contains a CR? Should DOS-style EOL conversion recognize when the buffer contains CR LF, and output it as CR LF (rather than CR CR LF)? 
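
One way the special feature asked about above might behave is sketched below. It is a Python illustration under stated assumptions, not a description of what Emacs does or later did: the name `encode_dos_eol_cr_aware` is hypothetical, and the rule (leave an LF alone when a CR already precedes it) is just the simplest reading of the question.

```python
import re

# Matches an LF that is NOT immediately preceded by a CR.
_BARE_LF = re.compile(rb"(?<!\r)\n")


def encode_dos_eol_cr_aware(buffer_text: bytes) -> bytes:
    """DOS-style EOL encoding that leaves an existing CR LF untouched.

    Hypothetical sketch: only a bare LF is expanded to CR LF.
    """
    return _BARE_LF.sub(b"\r\n", buffer_text)


print(encode_dos_eol_cr_aware(b"abc\r\n"))  # b'abc\r\n'
print(encode_dos_eol_cr_aware(b"abc\n"))    # b'abc\r\n'
```

The cost of such a rule is that two different buffers (`abc` LF and `abc` CR LF) are written as the same file, so a buffer that really is meant to end a line with a literal CR can no longer be saved as CR CR LF.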
From rms@gnu.ai.mit.edu Mon Aug 4 13:05:08 1997
X-VM-v5-Data: ([nil nil nil nil nil nil nil nil nil] [nil "Mon" " 4" "August" "1997" "16:05:23" "-0400" "Richard Stallman" "rms@gnu.ai.mit.edu" nil "18" "Re: [Marc.Fleischeuers@kub.nl: Emacs 20.0.92 on Windows NT 4.0: error converting cr-lf]" "^From:" nil nil "8" nil nil nil nil] nil)
Received: from psilocin.gnu.ai.mit.edu (psilocin.gnu.ai.mit.edu [128.52.46.62]) by june.cs.washington.edu (8.8.5+CS/7.2ju) with ESMTP id NAA04450 for ; Mon, 4 Aug 1997 13:05:07 -0700
Received: by psilocin.gnu.ai.mit.edu (8.8.5/8.6.12GNU) id QAA22977; Mon, 4 Aug 1997 16:05:23 -0400
Message-Id: <199708042005.QAA22977@psilocin.gnu.ai.mit.edu>
In-reply-to: (message from Marc Fleischeuers on 04 Aug 1997 15:36:27 +0200)
References: <199708041254.VAA05378@etlken.etl.go.jp>
From: Richard Stallman
To: Marc.Fleischeuers@kub.nl
CC: handa@etl.go.jp, Marc.Fleischeuers@kub.nl, voelker@cs.washington.edu, andrewi@harlequin.co.uk
Subject: Re: [Marc.Fleischeuers@kub.nl: Emacs 20.0.92 on Windows NT 4.0: error converting cr-lf]
Date: Mon, 4 Aug 1997 16:05:23 -0400

    I was reading there too.  It appears that emacs does a lot of thinking
    for me.  I have done some light testing, and it looks like everything
    acts like I expect it to, when (untranslated-file-p filename) returns
    t.

That sentence is ambiguous; it could mean there is no problem, or it
could mean there is a serious problem.

untranslated-file-p is supposed to return t only when the file resides
on a file system that is mounted on a Unix-like system.  That is an
unusual case for an MSDOS user; therefore, it is not the really
important case.  The really important case is when
untranslated-file-p returns nil.

So let's focus on the most important question first: what happens when
untranslated-file-p returns nil?  Do you get correct behavior in all
cases?  If not, can you tell us precisely which case is still
incorrect?
From handa@etl.go.jp Mon Aug 4 17:28:58 1997 X-VM-v5-Data: ([nil nil nil nil nil nil nil nil nil] [nil "Tue" " 5" "August" "1997" "09:29:44" "+0900" "Kenichi Handa" "handa@etl.go.jp" nil "34" "Re: [Marc.Fleischeuers@kub.nl: Emacs 20.0.92 on Windows NT 4.0: error converting cr-lf]" "^From:" nil nil "8" nil nil nil nil] nil) Received: from mail1-im.etl.go.jp (mail1-im.etl.go.jp [192.50.105.9]) by june.cs.washington.edu (8.8.5+CS/7.2ju) with ESMTP id RAA20984 for ; Mon, 4 Aug 1997 17:28:57 -0700 Received: from etlpom.etl.go.jp (etlpom.etl.go.jp [192.31.200.9]) by mail1-im.etl.go.jp (8.8.5/3.5Wpl1-96112918) with ESMTP id JAA25767; Tue, 5 Aug 1997 09:28:29 +0900 (JST) Received: from etlken.etl.go.jp (etlken.etl.go.jp [192.31.197.11]) by etlpom.etl.go.jp (8.8.5/3.5Wpl4-ETL_MASTER) with SMTP id JAA29964; Tue, 5 Aug 1997 09:28:29 +0900 (JST) Received: by etlken.etl.go.jp (SMI-8.6/6.4J.6-ETL.SLAVE) id JAA05934; Tue, 5 Aug 1997 09:29:44 +0900 Message-Id: <199708050029.JAA05934@etlken.etl.go.jp> In-reply-to: <199708041946.PAA22721@psilocin.gnu.ai.mit.edu> (message from Richard Stallman on Mon, 4 Aug 1997 15:46:48 -0400) References: <199707291709.NAA14978@psilocin.gnu.ai.mit.edu> <199707310605.XAA15156@joker.cs.washington.edu> <199707312038.NAA15222@joker.cs.washington.edu> <199707312342.TAA19727@psilocin.gnu.ai.mit.edu> <19 <199708041137.UAA05301@etlken.etl.go.jp> <199708041946.PAA22721@psilocin.gnu.ai.mit.edu> From: Kenichi Handa To: rms@gnu.ai.mit.edu CC: Marc.Fleischeuers@kub.nl, voelker@cs.washington.edu, andrewi@harlequin.co.uk Subject: Re: [Marc.Fleischeuers@kub.nl: Emacs 20.0.92 on Windows NT 4.0: error converting cr-lf] Date: Tue, 5 Aug 1997 09:29:44 +0900 Date: Mon, 4 Aug 1997 15:46:48 -0400 From: Richard Stallman The same code in Emacs that converts just LF to CR LF will of course do so when the preceding character is a CR-- unless there is something special to stop it. 
    As far as I know, there is nothing special to avoid encoding LF as
    CR LF based on the preceding character.  Handa, is there?

You are right.  I didn't write any such special code.

        If it is true, it explains everything, because Emacs writes
        CR as is and converts LF to CR LF when it writes a file by
        undecided-dos.

    Yes, of course.  I've been telling both of you this over and over.
    You've been trying to unravel a mystery which is not a mystery at all.

Please note that I joined this discussion halfway through.

    The real question is, should we put in a special feature to
    override that behavior when the buffer contains a CR?  Should
    DOS-style EOL conversion recognize when the buffer contains CR LF,
    and output it as CR LF (rather than CR CR LF)?

I don't like it because it's too kluge (can I use this word as an
adjective?).  But, if DOS users want it, it's not that difficult to
implement it.

---
Ken'ichi HANDA
handa@etl.go.jp

From rms@gnu.ai.mit.edu Mon Aug 4 23:31:17 1997
X-VM-v5-Data: ([nil nil nil nil nil nil nil nil nil] [nil "Tue" " 5" "August" "1997" "02:30:07" "-0400" "Richard Stallman" "rms@gnu.ai.mit.edu" nil "24" "Re: [Marc.Fleischeuers@kub.nl: Emacs 20.0.92 on Windows NT 4.0: error converting cr-lf]" "^From:" nil nil "8" nil nil nil nil] nil)
Received: from psilocin.gnu.ai.mit.edu (psilocin.gnu.ai.mit.edu [128.52.46.62]) by june.cs.washington.edu (8.8.5+CS/7.2ju) with ESMTP id XAA03522 for ; Mon, 4 Aug 1997 23:31:16 -0700
Received: by psilocin.gnu.ai.mit.edu (8.8.5/8.6.12GNU) id CAA31410; Tue, 5 Aug 1997 02:30:07 -0400
Message-Id: <199708050630.CAA31410@psilocin.gnu.ai.mit.edu>
In-reply-to: (message from Marc Fleischeuers on 04 Aug 1997 11:40:11 +0200)
References: <199707291709.NAA14978@psilocin.gnu.ai.mit.edu> <199707310605.XAA15156@joker.cs.washington.edu> <199707312038.NAA15222@joker.cs.washington.edu> <199707312342.TAA19727@psilocin.gnu.ai.mit.edu> <199708030423.AAA25764@psilocin.gnu.ai.mit.edu> <199708040133.KAA04718@etlken.etl.go.jp>
From: Richard Stallman
To:
Marc.Fleischeuers@kub.nl CC: handa@etl.go.jp, Marc.Fleischeuers@kub.nl, voelker@cs.washington.edu, andrewi@harlequin.co.uk, rms@gnu.ai.mit.edu Subject: Re: [Marc.Fleischeuers@kub.nl: Emacs 20.0.92 on Windows NT 4.0: error converting cr-lf] Date: Tue, 5 Aug 1997 02:30:07 -0400 When I first open a new file, the buffer-file-coding-system is nil and the mode-line indicator is `:'. If I insert `a b c C-q C-m C-q C-j' in the buffer and then save the file, the buffer contains the five bytes `abc\C-m\C-j', buffer-file-coding-system is still nil, and the mode-line indicator is still `:'. With `c:\emacs\bin\hexl abc', the contents of the file is `6162 630d 0d0a'. This is the right behavior, as Emacs is currently designed. It may not be quite the best feature, but it is not a bug. If I then re-visit the file (`C-x C-v RET') it contains six bytes `abc\C-j\C-j\C-j', buffer-file-coding-system is `- -- undecided-mac', and the mode-line indicator is `/'. This is a bug. The presence of CR CR LF in the file should not cause mac EOL conversion to be used. I think that Emacs is being too quick to use mac EOL conversion. I suspect that right now any CR not followed by LF does this. If the file contains a LF anywhere near the beginning, then mac EOL conversion should not be used. Handa, can you fix this? 
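
The two detection rules contrasted in the message above can be compared side by side. This is a Python sketch under stated assumptions, not the Emacs detection code: the function names are hypothetical, the "hasty" rule models the suspicion that the first CR not followed by LF decides for Mac, and the 1024-byte window is an arbitrary stand-in for "anywhere near the beginning".

```python
def detect_eol_hasty(data: bytes) -> str:
    """Suspected behaviour: the first EOL byte decides the format."""
    for i, byte in enumerate(data):
        if byte == 0x0D:  # CR
            # CR LF is DOS; any other CR is taken as a Mac line end.
            return "dos" if data[i + 1:i + 2] == b"\n" else "mac"
        if byte == 0x0A:  # LF
            return "unix"
    return "undecided"


def detect_eol_proposed(data: bytes, window: int = 1024) -> str:
    """Proposed rule: an LF near the beginning rules out Mac."""
    head = data[:window]
    if b"\n" not in head:
        return "mac" if b"\r" in head else "undecided"
    return "dos" if b"\r\n" in head else "unix"


def decode_mac_eol(data: bytes) -> bytes:
    """Mac-style EOL decoding: every CR becomes LF."""
    return data.replace(b"\r", b"\n")


file_bytes = b"abc\r\r\n"  # the file written in the report above
print(detect_eol_hasty(file_bytes))     # mac
print(decode_mac_eol(file_bytes))       # b'abc\n\n\n'
print(detect_eol_proposed(file_bytes))  # dos
```

Under the hasty rule the first CR is followed by another CR, so the file is treated as Mac and decodes to the six bytes seen on re-visiting; under the proposed rule the LF in the file prevents that choice.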
From rms@gnu.ai.mit.edu Mon Aug 4 23:35:32 1997 X-VM-v5-Data: ([nil nil nil nil nil nil nil nil nil] [nil "Tue" " 5" "August" "1997" "02:31:37" "-0400" "Richard Stallman" "rms@gnu.ai.mit.edu" nil "18" "Re: [Marc.Fleischeuers@kub.nl: Emacs 20.0.92 on Windows NT 4.0: error converting cr-lf]" "^From:" nil nil "8" nil nil nil nil] nil) Received: from psilocin.gnu.ai.mit.edu (psilocin.gnu.ai.mit.edu [128.52.46.62]) by june.cs.washington.edu (8.8.5+CS/7.2ju) with ESMTP id XAA03608 for ; Mon, 4 Aug 1997 23:35:32 -0700 Received: by psilocin.gnu.ai.mit.edu (8.8.5/8.6.12GNU) id CAA31418; Tue, 5 Aug 1997 02:31:37 -0400 Message-Id: <199708050631.CAA31418@psilocin.gnu.ai.mit.edu> In-reply-to: (message from Marc Fleischeuers on 04 Aug 1997 11:40:11 +0200) References: <199707291709.NAA14978@psilocin.gnu.ai.mit.edu> <199707310605.XAA15156@joker.cs.washington.edu> <199707312038.NAA15222@joker.cs.washington.edu> <199707312342.TAA19727@psilocin.gnu.ai.mit.edu> <199708030423.AAA25764@psilocin.gnu.ai.mit.edu> <199708040133.KAA04718@etlken.etl.go.jp> From: Richard Stallman To: Marc.Fleischeuers@kub.nl CC: handa@etl.go.jp, Marc.Fleischeuers@kub.nl, voelker@cs.washington.edu, andrewi@harlequin.co.uk Subject: Re: [Marc.Fleischeuers@kub.nl: Emacs 20.0.92 on Windows NT 4.0: error converting cr-lf] Date: Tue, 5 Aug 1997 02:31:37 -0400 I have also used the following version, started the same way, to perform exactly the same steps: In GNU Emacs 19.33.1 (i386-*-nt4.0) of Wed Aug 14 1996 on BANANA-FISH configured using `configure NT' The file is written and read back in as the five bytes `abc\C-m\C-y'. There is a mode-line indicator `(T:', both when I first open the file and when I read it back in. If you write the file out and then visit it again, you are performing two experiments in series and you are telling only the result of the two of them. That isn't really useful. You need to tell us the result of each experiment. 
In other words, what exactly is in the file when you write it with Emacs 19 in this way? Is it a b c CR CR LF or a b c CR LF or what? From handa@etl.go.jp Tue Aug 5 01:10:29 1997 X-VM-v5-Data: ([nil nil nil nil nil nil nil nil nil] [nil "Tue" " 5" "August" "1997" "17:10:48" "+0900" "Kenichi Handa" "handa@etl.go.jp" nil "30" "Re: [Marc.Fleischeuers@kub.nl: Emacs 20.0.92 on Windows NT 4.0: error converting cr-lf]" "^From:" nil nil "8" nil nil nil nil] nil) Received: from mail1-im.etl.go.jp (mail1-im.etl.go.jp [192.50.105.9]) by june.cs.washington.edu (8.8.5+CS/7.2ju) with ESMTP id BAA06992 for ; Tue, 5 Aug 1997 01:10:27 -0700 Received: from etlpom.etl.go.jp (etlpom.etl.go.jp [192.31.200.9]) by mail1-im.etl.go.jp (8.8.5/3.5Wpl1-96112918) with ESMTP id RAA23247; Tue, 5 Aug 1997 17:09:34 +0900 (JST) Received: from etlken.etl.go.jp (etlken.etl.go.jp [192.31.197.11]) by etlpom.etl.go.jp (8.8.5/3.5Wpl4-ETL_MASTER) with SMTP id RAA26427; Tue, 5 Aug 1997 17:09:34 +0900 (JST) Received: by etlken.etl.go.jp (SMI-8.6/6.4J.6-ETL.SLAVE) id RAA06812; Tue, 5 Aug 1997 17:10:48 +0900 Message-Id: <199708050810.RAA06812@etlken.etl.go.jp> In-reply-to: <199708050630.CAA31410@psilocin.gnu.ai.mit.edu> (message from Richard Stallman on Tue, 5 Aug 1997 02:30:07 -0400) References: <199707291709.NAA14978@psilocin.gnu.ai.mit.edu> <199707310605.XAA15156@joker.cs.washington.edu> <199707312038.NAA15222@joker.cs.washington.edu> <199707312342.TAA19727@psilocin.gnu.ai.mit.edu> <199708030423.AAA25764@psilocin.gnu.ai.mit.edu> <199708040133.KAA04718@etlken.etl.go.jp> <199708050630.CAA31410@psilocin.gnu.ai.mit.edu> From: Kenichi Handa To: rms@gnu.ai.mit.edu CC: Marc.Fleischeuers@kub.nl, Marc.Fleischeuers@kub.nl, voelker@cs.washington.edu, andrewi@harlequin.co.uk, rms@gnu.ai.mit.edu Subject: Re: [Marc.Fleischeuers@kub.nl: Emacs 20.0.92 on Windows NT 4.0: error converting cr-lf] Date: Tue, 5 Aug 1997 17:10:48 +0900 Richard Stallman writes: > If I then re-visit the file (`C-x C-v RET') it contains six bytes 
> `abc\C-j\C-j\C-j', buffer-file-coding-system is `- -- undecided-mac',
> and the mode-line indicator is `/'.

> This is a bug.  The presence of CR CR LF in the file
> should not cause mac EOL conversion to be used.

> I think that Emacs is being too quick to use mac EOL conversion.
> I suspect that right now any CR not followed by LF does this.

Right.

> If the file contains a LF anywhere near the beginning,
> then mac EOL conversion should not be used.

> Handa, can you fix this?

Yes.  But how about LF CR LF or CR LF LF?  Should they be recognized
as DOS format or Unix format?

Hmmm, how about accumulating how many times each possible end-of-line
format appears, and selecting the one which first occurs 3 times?  If
none occurs 3 times, perhaps we should select the one that occurs
last.  Then,

    CR CR LF -> DOS
    LF CR LF -> DOS
    CR LF LF -> Unix
    CR LF CR -> Mac

---
Ken'ichi HANDA
handa@etl.go.jp

From Marc.Fleischeuers@kub.nl Tue Aug 5 01:11:15 1997
X-VM-v5-Data: ([nil nil nil nil nil nil nil nil nil] [nil "" " 5" "August" "1997" "10:11:00" "+0200" "Marc Fleischeuers" "Marc.Fleischeuers@kub.nl" nil "42" "Re: [Marc.Fleischeuers@kub.nl: Emacs 20.0.92 on Windows NT 4.0: error converting cr-lf]" "^From:" nil nil "8" nil nil nil nil] nil)
Received: from mailnews.kub.nl (mailnews.kub.nl [137.56.0.220]) by june.cs.washington.edu (8.8.5+CS/7.2ju) with ESMTP id BAA07028 for ; Tue, 5 Aug 1997 01:11:09 -0700
Received: from PI0737.kub.nl (pi0737.kub.nl [137.56.38.229]) by mailnews.kub.nl (8.8.5/8.7.1) with SMTP id KAA22959; Tue, 5 Aug 1997 10:11:00 +0200 (MET DST)
References: <199707291709.NAA14978@psilocin.gnu.ai.mit.edu> <199707310605.XAA15156@joker.cs.washington.edu> <199707312038.NAA15222@joker.cs.washington.edu> <199707312342.TAA19727@psilocin.gnu.ai.mit.edu> <199708030423.AAA25764@psilocin.gnu.ai.mit.edu> <199708040133.KAA04718@etlken.etl.go.jp> <199708050631.CAA31418@psilocin.gnu.ai.mit.edu>
In-Reply-To: Richard Stallman's message of Tue, 5 Aug 1997 02:31:37 -0400
Message-ID:
Lines: 42
X-Mailer: Gnus v5.3/Emacs 19.33
From: Marc Fleischeuers
Sender: marcf@PI0737.kub.nl
To: Richard Stallman
Cc: Marc.Fleischeuers@kub.nl, handa@etl.go.jp, voelker@cs.washington.edu, andrewi@harlequin.co.uk
Subject: Re: [Marc.Fleischeuers@kub.nl: Emacs 20.0.92 on Windows NT 4.0: error converting cr-lf]
Date: 05 Aug 1997 10:11:00 +0200

Richard Stallman writes:

> If you write the file out and then visit it again,
> you are performing two experiments in series
> and you are telling only the result of the two of them.
>
> That isn't really useful.  You need to tell us the result
> of each experiment.

The emacsen are on different machines, hence I can safely use the same
pathnames.

GNU Emacs 19.33.1 (i386-*-nt4.0)        GNU Emacs 20.0.92.1 (i386-*-nt4.0)
started with:                           started with:
C:\> c:\emacs\bin\emacs.bat -nw         C:\> c:\emacs\bin\emacs.bat -nw
  --no-site-file --no-init-file           --no-site-file --no-init-file

Input:                                  Input:
C-x C-f t . t RET a b c C-q RET RET     C-x C-f t . t RET a b c C-q RET RET
d e f C-q RET RET C-x C-s               d e f C-q RET RET C-x C-s

Buffer looks like:                      Buffer looks like:
abc^M                                   abc^M
def^M                                   def^M

File contents:                          File contents:
6162 630d 0d0a 6465 660d 0d0a           6162 630d 0d0a 6465 660d 0d0a

Input:                                  Input:
C-x C-v RET                             C-x C-v RET

Buffer looks like:                      Buffer looks like:
abc^M                                   abc
def^M
-------
                                        def


                                        -------

Marc

From Marc.Fleischeuers@kub.nl Tue Aug 5 01:23:15 1997
X-VM-v5-Data: ([nil nil nil nil nil nil nil nil nil] [nil "" " 5" "August" "1997" "10:23:02" "+0200" "Marc Fleischeuers" "Marc.Fleischeuers@kub.nl" nil "14" "Re: [Marc.Fleischeuers@kub.nl: Emacs 20.0.92 on Windows NT 4.0: error converting cr-lf]" "^From:" nil nil "8" nil nil nil nil] nil)
Received: from mailnews.kub.nl (mailnews.kub.nl [137.56.0.220]) by june.cs.washington.edu (8.8.5+CS/7.2ju) with ESMTP id BAA07239 for ; Tue, 5 Aug 1997 01:23:14 -0700
Received: from PI0737.kub.nl (pi0737.kub.nl [137.56.38.229]) by mailnews.kub.nl (8.8.5/8.7.1) with SMTP id KAA23774; Tue, 5 Aug 1997 10:23:02 +0200 (MET DST)
References:
<199707291709.NAA14978@psilocin.gnu.ai.mit.edu> <199707310605.XAA15156@joker.cs.washington.edu> <199707312038.NAA15222@joker.cs.washington.edu> <199707312342.TAA19727@psilocin.gnu.ai.mit.edu> <199708030423.AAA25764@psilocin.gnu.ai.mit.edu> <199708040133.KAA04718@etlken.etl.go.jp> <199708050630.CAA31410@psilocin.gnu.ai.mit.edu> <199708050810.RAA06812@etlken.etl.go.jp> In-Reply-To: Kenichi Handa's message of Tue, 5 Aug 1997 17:10:48 +0900 Message-ID: Lines: 14 X-Mailer: Gnus v5.3/Emacs 19.33 From: Marc Fleischeuers Sender: marcf@PI0737.kub.nl To: Kenichi Handa Cc: rms@gnu.ai.mit.edu, Marc.Fleischeuers@kub.nl, voelker@cs.washington.edu, andrewi@harlequin.co.uk Subject: Re: [Marc.Fleischeuers@kub.nl: Emacs 20.0.92 on Windows NT 4.0: error converting cr-lf] Date: 05 Aug 1997 10:23:02 +0200 Kenichi Handa writes: > Yes. But how about LF CR LF or CR LF LF? Should they be recognized > as DOS format or Unix format? And what about CR CR LF LF? LF CR CR LF? Yes I'm joking. However, after spending two days chasing a bug eventually finding myself outwitted by emacs' intelligence in dos-w32.el, I tend to think that emacs should not be too smart. If the distribution of CR and LF throughout the file do not form a clear pattern, would `no conversion' be an option? Marc From rms@gnu.ai.mit.edu Tue Aug 5 01:39:26 1997 X-VM-v5-Data: ([nil nil nil nil nil nil nil nil nil] [nil "Tue" " 5" "August" "1997" "04:38:02" "-0400" "Richard Stallman" "rms@gnu.ai.mit.edu" nil "22" "Re: [Marc.Fleischeuers@kub.nl: Emacs 20.0.92 on Windows NT 4.0: error converting cr-lf]" "^From:" nil nil "8" nil nil nil nil] nil) Received: from psilocin.gnu.ai.mit.edu (psilocin.gnu.ai.mit.edu [128.52.46.62]) by june.cs.washington