Reverse Engineering of Nodelist DIFFs ...
[28.4.2007] While recovering single files and parts of the Fido-History-Project archive,
I ran into a case where two full lists exist, with all but one of the diffs
between them available in sequence.
In detail:
In 1993, Region 24 was reorganized in Nodelist #173/1993. The changes weren't reflected
in the Region Pointlist from the beginning, so distribution with the new structure started
only after a few weeks.
In the meantime, R24 split into R24 Classic (the group that patched the official
Nodelist with the classic R24 structure from before #173/1993) and R24 Light.
The Pointlist likewise split into an R24 Classic distribution and an R24 Light
distribution.
Because of system crashes, defective tapes, and lost floppies, the archive of old R24 Pointlists
was reduced to about 6 Pointlists after #173/1993, followed by consecutive R24 Pointlists
starting with #049/1994. Between the #351 and #365/1993 lists and the #049/1994 list,
consecutive Pr24Diffs starting with #007/1994 are available.
So the idea was to reverse the DIFF processing, working backwards
from Points24.049/1994 to Points24.365/1993, and to fill the remaining gaps
with the information from Points24.351/1993.
Theory
Assumption: I have a source FTN list and an FTN diff file.
List #014 | Diff #021   |    | List #021 |
A         | C1 Copy 1   | => | A         |
          | A1 Add 1    |    |           |
          | Q           | => | Q         |
B         | C1 Copy 1   | => | B         |
C         | D1 Delete 1 | X  |           |
D         | C1 Copy 1   | => | D         |
A source list with records A, B, C and D becomes A, Q, B and D after the diff process.
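The forward process from the table above can be sketched in a few lines of Python. This is only a minimal illustration: the function name apply_diff is made up here, records are plain strings, and any header line that a real diff file carries is assumed to have been removed already.

```python
def apply_diff(old_lines, diff_cmds):
    """Apply Copy/Add/Delete commands to old_lines and return the new list.

    diff_cmds is a list of strings such as "C1", "A1", "D1"; the payload
    lines of an Add command follow directly after the command itself.
    """
    new_lines = []
    pos = 0  # read position in the old list
    i = 0    # read position in the diff
    while i < len(diff_cmds):
        op, count = diff_cmds[i][0], int(diff_cmds[i][1:])
        i += 1
        if op == "C":    # copy records unchanged from the old list
            new_lines.extend(old_lines[pos:pos + count])
            pos += count
        elif op == "D":  # delete: skip records of the old list
            pos += count
        elif op == "A":  # add: take the next `count` lines from the diff
            new_lines.extend(diff_cmds[i:i + count])
            i += count
        else:
            raise ValueError(f"unknown diff command: {diff_cmds[i - 1]!r}")
    return new_lines


list_014 = ["A", "B", "C", "D"]
diff_021 = ["C1", "A1", "Q", "C1", "D1", "C1"]
print(apply_diff(list_014, diff_021))  # ['A', 'Q', 'B', 'D']
```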
Reverse engineering:
In the reverse direction, there is a chance to rebuild List #014 from List #021 and
Diff #021, provided that in a 2nd step I can refer to an earlier list (#007):
List #021 | Diff #021                    |    | List #014 |
A         | C1 Copy 1                    | => | A         |
Q         | A1 Add 1 (reverse: Delete 1) | X  |           |
          | Q skip ADDs                  |    |           |
B         | C1 Copy 1                    | => | B         |
          | D1 Delete 1 (reverse: Add 1) | => | ?         |
D         | C1 Copy 1                    | => | D         |
A source list with records A, Q, B and D becomes A, B, ? and D after the reverse diff process.
COPY works the same way in both directions of the DIFF.
An ADD command means that a record does not exist in the old list
and is inserted into the newer list.
Seen from the new list, this record did not exist before.
So in the reverse direction the ADD command works as a DELETE command.
Now to the DELETE command:
In reverse mode, the DELETE command becomes an ADD command.
Going forward, a record that existed before is deleted. In reverse
engineering, this record has to be rebuilt. As I don't know
the content that was deleted, I use a
placeholder.
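The three reverse rules (COPY stays COPY, ADD becomes DELETE, DELETE becomes ADD with a placeholder) can be written down as a short Python sketch. The names reverse_diff and PLACEHOLDER are invented for this illustration, and the extra check that the ADD payload matches the newer list is an addition of the sketch, not something described above.

```python
PLACEHOLDER = "?"  # stands for a deleted record whose content is unknown


def reverse_diff(new_lines, diff_cmds):
    """Rebuild the older list from the newer list and the forward diff."""
    old_lines = []
    pos = 0  # read position in the newer list
    i = 0    # read position in the diff
    while i < len(diff_cmds):
        op, count = diff_cmds[i][0], int(diff_cmds[i][1:])
        i += 1
        if op == "C":    # copy works in both directions
            old_lines.extend(new_lines[pos:pos + count])
            pos += count
        elif op == "A":  # reverse of Add: drop the records from the newer list
            if new_lines[pos:pos + count] != diff_cmds[i:i + count]:
                raise ValueError("added lines do not match the newer list")
            pos += count
            i += count   # skip the Add payload in the diff
        elif op == "D":  # reverse of Delete: content unknown, use placeholders
            old_lines.extend([PLACEHOLDER] * count)
        else:
            raise ValueError(f"unknown diff command: {diff_cmds[i - 1]!r}")
    return old_lines


list_021 = ["A", "Q", "B", "D"]
diff_021 = ["C1", "A1", "Q", "C1", "D1", "C1"]
print(reverse_diff(list_021, diff_021))  # ['A', 'B', '?', 'D']
```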
The 2nd step, the alignment process:
If I have a list from before (#007) the list that has to be
recovered (#014), I can compare the older list (#007) record by record
with the recovered list (#014) and replace each placeholder
with the corresponding content from the older list (#007).
List #007 |     | List #014 |
A         |     | A         |
B         |     | B         |
C         | =>> | ?         |
D         |     | D         |
A recovered list with records A, B, ? and D becomes A, B, C and D after the list alignment process.
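One possible way to automate the alignment is sketched below. It uses Python's difflib.SequenceMatcher to line up the known records of both lists, which is a choice made for this sketch and not necessarily how the comparison was done by hand. The function name fill_placeholders is hypothetical. A placeholder is only filled when a gap of equal length sits between two matching stretches; everything else is left as "?".

```python
import difflib


def fill_placeholders(older, recovered, placeholder="?"):
    """Replace placeholders in `recovered` with the aligned lines of `older`."""
    result = list(recovered)
    matcher = difflib.SequenceMatcher(None, older, recovered, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        # Only a 1:1 replacement block can be filled without guessing.
        if tag == "replace" and (i2 - i1) == (j2 - j1):
            for k in range(j2 - j1):
                if recovered[j1 + k] == placeholder:
                    result[j1 + k] = older[i1 + k]
    return result


list_007 = ["A", "B", "C", "D"]
recovered_014 = ["A", "B", "?", "D"]
print(fill_placeholders(list_007, recovered_014))  # ['A', 'B', 'C', 'D']
```

Placeholders that stay in the result after this step are exactly the lines that still have to be found some other way.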
Known Problems:
If a record was added to the list to be recovered and was deleted again later in the
diff sequence used to work back to that list, no information about the
original content of that line exists (not in the list prior to the list to be recovered,
not in the diffs, and not in the starting list for the recovery).
The same problem occurs if a record was deleted and re-added
(a change whose new line is deleted again later in the diff sequence).
One good thing:
All such problems are detected by the CRC check, but a failed check also means
that the list has not been recovered successfully.
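A CRC check of a recovered list could look roughly like the sketch below. It rests on several assumptions that should be verified against the list format specification before relying on it: that the checksum is a 16-bit CRC with polynomial 0x1021 and start value 0 (which is what Python's binascii.crc_hqx computes), that it covers everything after the first line up to but not including an EOF byte 0x1A, and that the first line ends with ": NNNNN" holding the expected value in decimal. The function names are invented for this example.

```python
import binascii


def body_crc(data: bytes) -> int:
    """CRC over everything after the first line, without EOF (assumed rule)."""
    _, _, body = data.partition(b"\n")
    body = body.split(b"\x1a", 1)[0]  # drop the EOF byte and anything after it
    return binascii.crc_hqx(body, 0)


def declared_crc(data: bytes) -> int:
    """Decimal CRC value after the last ':' of the first line (assumed layout)."""
    first_line = data.split(b"\n", 1)[0]
    return int(first_line.rsplit(b":", 1)[1].strip().decode("ascii"))


def crc_ok(data: bytes) -> bool:
    return body_crc(data) == declared_crc(data)


# Made-up example data, not a real pointlist.
sample = b";A sample list header : 12739\r\n123456789\x1a"
print(crc_ok(sample))  # True
```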
LIST | Prior List #007 | List to recover #014 | 1st DIFF #021 | 2nd DIFF #028 | 3rd DIFF #035 | Starting List #035 |
Real Content | ,1,Point1,.. | ,1,Sysop,.. | ,1,Sysop,.. | ,1,CoSysop,.. | ,1,CoSysop,.. | ,1,CoSysop,.. |
Diff cmd | | D1 A1 ,1,Sysop,.. | C1 | D1 A1 ,1,CoSysop,.. | C1 | C1 |
Known by Rev.Eng process | ,1,Point1,.. | D1 -> (,1,Point1,..) A1 ? | ? | D1 -> ? A1 ,1,CoSysop,.. | ,1,CoSysop,.. | ,1,CoSysop,.. |
Problems in practice:
In the attempt to recover Points24.365/1993 backwards from Points24.049/1994 with the diffs
# 007, 014, 021, 028, 035, 042, 049 / 1994
I successfully recovered Points24.042/1994 (!), CRC check OK :-)
But one list earlier, Points24.035 results in an error.
There are 40 Bossnode segments with possible mismatches, each with 1 to 70 possibly mismatched
point listings. Lines that have been replaced or removed are such possible mismatches.
To resolve this problem, every entry has to be checked to see whether one of these lines has an
ADD or CHANGE reference in one of the DIFFs # 007, 014, 021, 028, 035, 042,
in which case the line is no longer a possible mismatch.
By visual comparison and checking, I could exclude the first seven Bossnode segments
(all seven segments checked so far) from the mismatch table, because their lines are referenced
in one of the DIFFs #007-042.
Checking each referenced entry step by step reduces the number of possible mismatch lines,
which can hopefully be fixed in a later step to get a correct CRC value for Points24.035.
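The manual check could in principle be supported by a small script like the following sketch. It treats a line as confirmed when it appears as the payload of an ADD command in any of the diffs (a CHANGE shows up as a DELETE followed by an ADD, so it is covered as well). The function names and the sample lines are made up for illustration.

```python
def added_lines(diff_cmds):
    """Collect all payload lines of Add commands in one diff."""
    added = set()
    i = 0
    while i < len(diff_cmds):
        op, count = diff_cmds[i][0], int(diff_cmds[i][1:])
        i += 1
        if op == "A":
            added.update(diff_cmds[i:i + count])
            i += count  # skip the payload so it is not parsed as a command
    return added


def remaining_mismatches(candidates, diffs):
    """Return the candidate lines that no diff confirms through an Add."""
    confirmed = set()
    for diff_cmds in diffs:
        confirmed |= added_lines(diff_cmds)
    return [line for line in candidates if line not in confirmed]


diffs = [["C1", "D1", "A1", ",1,Sysop,..", "C2"]]
candidates = [",1,Sysop,..", ",2,Point2,.."]
print(remaining_mismatches(candidates, diffs))  # [',2,Point2,..']
```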
One note: sources of Net 244 Bossnode Pointlist segments sent for Points24.365 help me
to find some of the correct line listings. The other mismatches I have to figure out myself.
to be continued ...
because it's a bit time-consuming ... :-/
May 2007: giving up.