EOL differences

Postby kimmov » Tue May 12, 2009 11:02 pm

Matthias and I have been discussing EOL differences in many tracker items. I think it's better to discuss them in one forum thread. Easier to follow.

As everybody knows there are three EOL types:
  • DOS/Windows, CR + LF (bytes 0x0d + 0x0a)
  • Unix/Linux/Mac OS X, LF (byte 0x0a)
  • Classic Mac OS, CR (byte 0x0d)
The Wikipedia article lists a few more EOL styles for Unicode files. We don't support those yet, but probably should in the future.
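
As a rough illustration of the three styles listed above, here is a minimal Python sketch that counts which EOL styles occur in a byte buffer. The names `count_eol_styles` and `classify_file` are hypothetical and are not taken from the WinMerge code base.

```python
import re

# CR+LF must be tried first, otherwise it would be counted as a CR plus an LF.
EOL_PATTERN = re.compile(rb"\r\n|\r|\n")
EOL_NAMES = {b"\r\n": "DOS/Windows", b"\n": "Unix", b"\r": "Classic Mac"}


def count_eol_styles(data: bytes) -> dict:
    """Return how many times each EOL style occurs in data."""
    counts = {}
    for match in EOL_PATTERN.finditer(data):
        name = EOL_NAMES[match.group()]
        counts[name] = counts.get(name, 0) + 1
    return counts


def classify_file(data: bytes) -> str:
    """Return the single EOL style of data, or 'mixed' or 'none'."""
    counts = count_eol_styles(data)
    if not counts:
        return "none"
    if len(counts) == 1:
        return next(iter(counts))
    return "mixed"
```

A file whose count has more than one key is the "mixed" case discussed later in this thread.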

In the common case, every line in a file has the same EOL type, any one of those three. Then there is the rare case of a file having two or more EOL styles. EOL style can be thought of in two ways: either the EOL is part of the line, so the EOL bytes are part of the line data, or the EOL is a property of the file. Traditionally we've been thinking the first way: every line has EOL bytes. But I want to change it to the latter: EOL style is a property of the file. This makes lots of things easier to implement and more natural for the user.

From the user's point of view, there are two things users want to do with files in WinMerge file compare:
  • view differences
  • merge differences
and in both operations the current EOL differencing system gets in their way, pretty annoyingly.

The simplest case, files having the same EOL style, can be skipped as it is not interesting for this discussion. When files have different EOL styles, things get interesting.

1. Diffing files. When the user opens two files with different EOL styles, what should happen? What does the user expect to see? The user may or may not know the files have different EOL styles. But does one really care about that if one wants to see the differences? I can't think of a situation where the EOL difference would be interesting info for every line. Line content is interesting, not EOL.

So I think we should just somehow inform the user that the files have different EOL styles (a simple message box) and then just "forget" this as far as differences between the files go. We can show the EOL styles in the status bar (as we do currently) and that is enough information. There should be no situation (like there currently is) where we show the whole file as different because of EOL differences.

We won't even highlight the difference, as it is again meaningless for the user. It is obvious that every line has a different EOL, so highlighting EOL bytes adds no information. Quite the contrary, highlighting EOL bytes makes it look like there is some difference that could be merged. But there is none.

In short, file compare will behave as if we always had EOL ignore on by default, except that we show the actual style in the status bar, and (possibly) a message when files with different EOL styles are opened. Though even this message might be overkill. Perhaps it can be an optional feature.
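
The "EOL ignore on by default" behaviour can be sketched roughly as follows: split both files into line contents, drop the EOL bytes, and diff only the contents. This is a Python illustration using the standard `difflib` module, not WinMerge's actual diff engine, and `split_content` and `differing_lines` are hypothetical names.

```python
import difflib
import re


def split_content(data: bytes) -> list:
    """Split data into line contents, discarding the EOL bytes."""
    lines = re.split(rb"\r\n|\r|\n", data)
    # A trailing EOL leaves an empty last piece, which is not a real line.
    if lines and lines[-1] == b"":
        lines.pop()
    return lines


def differing_lines(left: bytes, right: bytes) -> list:
    """Return the non-equal diff opcodes, comparing line contents only."""
    a, b = split_content(left), split_content(right)
    matcher = difflib.SequenceMatcher(None, a, b, autojunk=False)
    return [op for op in matcher.get_opcodes() if op[0] != "equal"]
```

With this approach, a DOS file and a Unix file with identical text produce no differences at all, and only real content changes show up.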

2. Merging differences. When the user wants to copy a difference from one file to another, things are a bit more complex. What does the user want to copy? The user wants to copy line data (text), not EOL bytes. There is a very simple rule: WinMerge must not break the user's files! If we copy another EOL style into a file, creating a mixed-EOL file, we seriously break the user's file!

When copying a diff (lines), we copy the line data from one file to the other. From the original file we take the line data without EOL bytes and add the lines to the target file with the target file's EOL style.
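
A minimal sketch of this merge rule, assuming a model where each file stores its line contents without EOL bytes and carries a single file-level EOL style. The `TextFile` class and `copy_diff` function are hypothetical names for illustration, not WinMerge code.

```python
from dataclasses import dataclass, field


@dataclass
class TextFile:
    """Line contents without EOL bytes, plus one EOL style for the whole file."""
    eol: bytes
    lines: list = field(default_factory=list)

    def to_bytes(self) -> bytes:
        # EOL bytes are attached only when the file is serialized.
        return b"".join(line + self.eol for line in self.lines)


def copy_diff(source, target, src_start, src_end, dst_start, dst_end):
    """Replace target lines [dst_start, dst_end) with source lines [src_start, src_end)."""
    # Only line contents move; the target keeps its own EOL style.
    target.lines[dst_start:dst_end] = source.lines[src_start:src_end]


left = TextFile(b"\n", [b"one", b"two", b"three"])
right = TextFile(b"\r\n", [b"one", b"2", b"three"])
copy_diff(left, right, 1, 2, 1, 2)
```

After the copy, `right` holds the text from `left` but still serializes with CR + LF on every line, so no mixed-EOL file can result.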
kimmov
 
Posts: 562
Joined: Thu Sep 11, 2008 8:51 pm
Location: Finland

Re: EOL differences

Postby kimmov » Wed May 13, 2009 6:01 pm

I didn't mention one special case in my first post: files with mixed EOL styles. These files should not exist, since it is always an error for a file to have more than one EOL style. But in practice these kinds of files sometimes appear due to buggy or incapable tools. So WinMerge must be able to handle those files as well.

So far we have handled these files as "normal" files: we've added special code to handle per-line EOL styles. We've thought "so somebody gave us a broken file, let's just behave like it is OK and try to deal with it". So instead of trying to fix the broken file for the user, we just let the user suffer from the error. Some other tool might break on the file, or mess up the file even more.

I think the only sane thing to do when finding more than one EOL style in one file is to show a big warning to the user, explaining the situation and simply asking which EOL style to use for the whole file. And then fix the EOL style.

I cannot think of any situation where somebody wants to keep the file broken. I think users are happy when WinMerge fixes the errors of other tools.
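
Fixing a mixed file once the user has picked a style could look roughly like the Python sketch below. It is only an illustration of the idea; `normalize_eol` is a hypothetical name and the real fix would live in the editor's buffer code.

```python
import re

VALID_EOLS = (b"\r\n", b"\n", b"\r")


def normalize_eol(data: bytes, eol: bytes) -> bytes:
    """Rewrite every EOL in data to the single style chosen by the user."""
    if eol not in VALID_EOLS:
        raise ValueError("eol must be CR+LF, LF or CR")
    # CR+LF is matched first so it is replaced as one EOL, not as two.
    return re.sub(rb"\r\n|\r|\n", lambda match: eol, data)
```

For example, normalizing `b"a\r\nb\nc\rd"` to LF yields `b"a\nb\nc\nd"`.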

Re: EOL differences

Postby matthias1955 » Wed May 13, 2009 9:05 pm

1. Diffing files.
a) Yes, we already have that message box. Just make it the default to show it again.
That's really enough.
b) As I remember, WM also asks to change EOL_STYLE_MIXED to EOL_DOS/WIN.

2. Merging differences
Don't worry about the style.
We can copy a different EOL style from left to right and mix the styles. But at the moment we store the file,
WM uses the style of the original file. Only for mixed files does it keep the style from the other view (from single lines!).
matthias1955
 
Posts: 162
Joined: Wed Dec 17, 2008 1:55 pm

Re: EOL differences

Postby kimmov » Wed May 13, 2009 10:04 pm

matthias1955 wrote:1. Diffing files.
a) Yes, we already have that message box. Just make it the default to show it again.
That's really enough.

My point is that we should only inform the user. We shouldn't ask anything. Currently people answer wrong since they don't know what they should answer, and we get those bug reports about all lines being marked different. I've replied to way too many such reports, and it was always the user making the wrong selection. So that GUI simply is wrong.

matthias1955 wrote:b) As I remember, WM also asks to change EOL_STYLE_MIXED to EOL_DOS/WIN.

And that dialog is the worst part of the WinMerge GUI. Nobody can understand the message correctly. Several people have tried to improve the text, and it is currently barely understandable if you are a geek. But normal users have no clue about it.

matthias1955 wrote:2. Merging differences
Don't worry about the style.

I worry. I worry about users' files and data.

matthias1955 wrote:We can copy a different EOL style from left to right and mix the styles.

We absolutely won't copy styles from side to side. Like I wrote. That would create broken files.

Re: EOL differences

Postby matthias1955 » Thu May 14, 2009 9:20 pm

kimmov wrote:
We absolutely won't copy styles from side to side. Like I wrote. That would create broken files.


I think you misunderstand. We can copy between the views; when storing, we keep our original style.
So we never create a broken one. :D
Only for mixed files do we keep the EOL style from the new line. 8-)

Re: EOL differences

Postby kimmov » Thu May 14, 2009 9:46 pm

Copying EOL bytes is wasted time and resources. We only need to copy line content (without EOL bytes), and add EOL bytes when needed (writing to disk, copying to the clipboard, etc.).

I'm currently refactoring the editor code to get rid of unnecessary EOL byte handling. So far it looks like it simplifies some operations quite a bit...

matthias1955 wrote:Only for mixed files do we keep the EOL style from the new line.

What do you mean?

Re: EOL differences

Postby matthias1955 » Fri May 15, 2009 7:41 pm

Say you have a mixed file on the left and WIN on the right.
You can copy some lines with a different EOL from left to right. Now you would expect mixed on the right too.
But it isn't. After storing you still have plain WIN style on the right, not mixed.
The same happens between UNIX and WIN.
Only when both views are mixed is the style of each line copied from the other view stored into the file.

Re: EOL differences

Postby kimmov » Fri May 15, 2009 9:12 pm

matthias1955 wrote:Say you have a mixed file on the left and WIN on the right.
You can copy some lines with a different EOL from left to right. Now you would expect mixed on the right too.

Weird expectation. Open two files with different EOL styles in some sane editor (like Notepad++). Copy lines between those two files. The EOL styles of the files don't change. Sane editors correctly treat EOL style as a property of the file.

We made a mistake in WinMerge years ago and tried to be too smart. But I'm fixing this...

matthias1955 wrote:But it isn't. After storing you still have plain WIN style on the right, not mixed.
The same happens between UNIX and WIN.

Yes, because WinMerge protects the user from breaking files.

matthias1955 wrote:Only when both views are mixed is the style of each line copied from the other view stored into the file.

And this is what I'll remove later. We won't be editing mixed files in the future.

Re: EOL differences

Postby matthias1955 » Thu May 21, 2009 2:37 pm

Does that also mean we no longer show EOL_STYLE diffs in our view?

I found a good text for mixed styles.
Open a file with mixed EOL styles in your VS IDE. It shows a good message that we can use in WM too.
Not complicated, and easy to understand.
We can just add "read manual topic... MIXED_EOL_STYLE" for more info.

'The EOLs in this file are not consistent.
Would you like to normalize the EOLs?'
file:...%path%
MS adds a list box here with the suitable EOL_STYLES to select for conversion.

Re: EOL differences

Postby kimmov » Thu May 21, 2009 4:03 pm

We'd better not copy MS programs so directly.
