Data type mismatches in downloaded files
|Category:||General||Estimated time:||3.00 hours|
|Target version:||040 - FarfarAway|
The file extension, file signature, and content_type should all agree, although not all three will always be available.
There are millions of file types, but fewer than 25 are common: about 20 binary types for media and archives, and three general formatted text types (HTML, XHTML, and the XML family). Most of these file types can be recognized from the first 8 bytes of a file. Text files may have useless (to us) information at the front, so up to 128 bytes may need to be read to find the signature. The signatures are already known for use by JD unrar.
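A minimal sketch of the signature check described above, written in Python for brevity rather than as a patch. The function name `sniff_type` and the signature table are illustrative assumptions; this is not the table JD unrar uses, and it covers only a handful of well-known magic numbers.

```python
# Illustrative signature table (assumption: NOT the table used by JD unrar).
# Each entry maps the leading bytes of a file to a short type name.
BINARY_SIGNATURES = [
    (b"\x89PNG\r\n\x1a\n", "png"),
    (b"\xff\xd8\xff", "jpeg"),
    (b"GIF87a", "gif"),
    (b"GIF89a", "gif"),
    (b"Rar!\x1a\x07", "rar"),
    (b"PK\x03\x04", "zip"),
    (b"7z\xbc\xaf\x27\x1c", "7z"),
    (b"\x1f\x8b", "gzip"),
    (b"%PDF-", "pdf"),
]


def sniff_type(head: bytes):
    """Guess a file type from the first bytes of a download, or return None."""
    # Binary types: the first 8 bytes are enough for every entry above.
    for magic, name in BINARY_SIGNATURES:
        if head.startswith(magic):
            return name

    # Text types: look at up to 128 bytes, skipping a UTF-8 BOM and whitespace.
    text = head[:128].lower()
    if text.startswith(b"\xef\xbb\xbf"):
        text = text[3:]
    text = text.lstrip()
    if not text.startswith(b"<"):
        return None
    if b"xhtml" in text:
        return "xhtml"
    if b"<!doctype html" in text or b"<html" in text:
        return "html"
    if text.startswith(b"<?xml"):
        return "xml"
    return None
```

The caller would read at most 128 bytes from the start of the downloaded file and pass them in. A result of `None` means "unknown", not "invalid".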
The most important distinction is between web site data (HTML, XHTML, PNG, JPEG, GIF) and expected file data.
Currently, HTML data from Rapidshare is being saved into rar files.
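The comparison itself could look roughly like the following sketch, again in Python. The name `check_download`, the result strings, and the small lookup tables are hypothetical; the signature type is assumed to have been determined separately and is passed in as a short name such as "rar" or "html". Unknown or missing values are ignored, since not all three sources are always available.

```python
import os

# Types that indicate web site data rather than the expected file.
WEB_TYPES = {"html", "xhtml", "png", "jpeg", "gif"}

# Small illustrative lookup tables (assumption: far from complete).
EXTENSION_TYPES = {
    ".rar": "rar", ".zip": "zip",
    ".html": "html", ".htm": "html",
    ".png": "png", ".jpg": "jpeg", ".jpeg": "jpeg", ".gif": "gif",
}
CONTENT_TYPES = {
    "text/html": "html",
    "application/xhtml+xml": "xhtml",
    "image/png": "png", "image/jpeg": "jpeg", "image/gif": "gif",
    "application/zip": "zip",
    "application/x-rar-compressed": "rar",
    "application/vnd.rar": "rar",
}


def check_download(filename, signature_type=None, content_type=None):
    """Compare extension, signature, and content type; return a status string."""
    ext = os.path.splitext(filename)[1].lower()
    by_extension = EXTENSION_TYPES.get(ext)

    by_header = None
    if content_type:
        # Drop parameters such as "; charset=utf-8" before the lookup.
        mime = content_type.split(";", 1)[0].strip().lower()
        by_header = CONTENT_TYPES.get(mime)

    # Only sources that yielded a known type take part in the comparison.
    known = {t for t in (by_extension, signature_type, by_header) if t}
    if len(known) <= 1:
        return "ok"

    # Expected a non-web file, but the data looks like web site content.
    expects_file = by_extension is not None and by_extension not in WEB_TYPES
    if expects_file and (signature_type in WEB_TYPES or by_header in WEB_TYPES):
        return "web-page-instead-of-file"
    return "mismatch"
```

For the Rapidshare case, `check_download("movie.part1.rar", "html", "text/html; charset=utf-8")` returns `"web-page-instead-of-file"`, which is the situation where the download should be rejected instead of being written to disk as a rar file.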
The target release is the one that follows the release expected within the next month.