Voici une solution légèrement différente. Au lieu d'utiliser plusieurs lignes, je me suis concentré sur la dernière ligne, et au lieu d'avoir une taille de bloc constante, j'ai une taille de bloc dynamique (qui double). Voir les commentaires pour plus d'informations.
# Get last line of a text file using seek method. Works with non-constant block size.
# IDK if that speed things up, but it's good enough for us,
# especially with constant line lengths in the file (provided by len_guess),
# in which case the block size doubling is not performed much if at all. Currently,
# we're using this on a textfile format with constant line lengths.
# Requires that the file is opened up in binary mode. No nonzero end-rel seeks in text mode.
REL_FILE_END = 2
def lastTextFileLine(file, len_guess=1):
file.seek(-1, REL_FILE_END) # 1 => go back to position 0; -1 => 1 char back from end of file
text = file.read(1)
tot_sz = 1 # store total size so we know where to seek to next rel file end
if text != b'\n': # if newline is the last character, we want the text right before it
file.seek(0, REL_FILE_END) # else, consider the text all the way at the end (after last newline)
tot_sz = 0
blocks = [] # For storing succesive search blocks, so that we don't end up searching in the already searched
j = file.tell() # j = end pos
not_done = True
block_sz = len_guess
while not_done:
if j < block_sz: # in case our block doubling takes us past the start of the file (here j also = length of file remainder)
block_sz = j
not_done = False
tot_sz += block_sz
file.seek(-tot_sz, REL_FILE_END) # Yes, seek() works with negative numbers for seeking backward from file end
text = file.read(block_sz)
i = text.rfind(b'\n')
if i != -1:
text = text[i+1:].join(reversed(blocks))
return str(text)
else:
blocks.append(text)
block_sz <<= 1 # double block size (converge with open ended binary search-like strategy)
j = j - block_sz # if this doesn't work, try using tmp j1 = file.tell() above
return str(b''.join(reversed(blocks))) # if newline was never found, return everything read
Idéalement, vous devriez intégrer cette fonction dans la classe LastTextFileLine et suivre une moyenne mobile de la longueur des lignes. Cela vous donnerait une bonne idée de len_guess.