Bug #1542

Data type mismatches in downloaded files

Added by drbits almost 11 years ago. Updated almost 11 years ago.

Status: New
Start date: 04/06/2010
Priority: Normal
Due date:
Assignee: -
% Done:
Category: General
Estimated time: 3.00 hours
Target version: 040 - FarfarAway


The file extension, file signature, and content_type of a downloaded file should all agree, although not all three will always be available.

There are millions of file types, but fewer than 25 are common: about 20 binary types for media and archives, and three general formatted text types (HTML, XHTML, and the XML family). Most of these file types can be recognized from the first 8 bytes of a file. Text files, however, may start with information that is useless to us, so up to 128 bytes may need to be read to find the signature. The signatures are already known from their use by JD unrar.
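
A minimal sketch of the signature check described above, written in Python for brevity (JDownloader itself is Java). The function name `sniff_type` and the two tables are hypothetical and are not taken from JD or JD unrar. The byte sequences are the commonly documented magic numbers for each format, and the list is deliberately incomplete.

```python
# Hypothetical illustration only; not JDownloader code.

# Binary formats: the signature sits at offset 0, within the first 8 bytes.
BINARY_SIGNATURES = [
    (b"Rar!\x1a\x07", "rar"),
    (b"PK\x03\x04", "zip"),
    (b"7z\xbc\xaf\x27\x1c", "7z"),
    (b"\x89PNG\r\n\x1a\n", "png"),
    (b"\xff\xd8\xff", "jpeg"),
    (b"GIF87a", "gif"),
    (b"GIF89a", "gif"),
]

# Text formats: the marker may be preceded by a byte order mark,
# whitespace, or comments, so it is searched for rather than anchored.
# HTML markers are tried first, so XHTML with an XML prolog reports "html".
TEXT_MARKERS = [
    (b"<!doctype html", "html"),
    (b"<html", "html"),
    (b"<?xml", "xml"),
]


def sniff_type(head):
    """Guess a type name from the first bytes of a file, or return None."""
    head = head[:128]  # never look further than 128 bytes
    for magic, name in BINARY_SIGNATURES:
        if head.startswith(magic):
            return name
    lowered = head.lower()  # text markers are matched case-insensitively
    for marker, name in TEXT_MARKERS:
        if marker in lowered:
            return name
    return None
```

Binary signatures are tested before text markers so that a media file that happens to contain `<html` in its first bytes is not misreported as HTML.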

The most important distinction is between web site data (HTML, XHTML, PNG, JPEG, GIF) and expected file data.
Currently, HTML data from Rapidshare is being saved into rar files.
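
The check requested here could look roughly like the following sketch, written in Python for brevity. Everything in it is an assumption made for illustration: the function name `check_download`, the type names, and the small extension and MIME tables. It takes a signature type that has already been detected and compares whatever evidence is available.

```python
import os

# Hypothetical illustration only; not JDownloader code.
# Types that indicate web site data rather than the expected file.
# XHTML is folded into "html" here to keep the sketch short.
WEB_TYPES = {"html", "png", "jpeg", "gif"}

EXTENSION_TYPES = {
    ".rar": "rar", ".zip": "zip",
    ".htm": "html", ".html": "html",
    ".png": "png", ".jpg": "jpeg", ".jpeg": "jpeg", ".gif": "gif",
}

MIME_TYPES = {
    "application/x-rar-compressed": "rar", "application/zip": "zip",
    "text/html": "html", "application/xhtml+xml": "html",
    "image/png": "png", "image/jpeg": "jpeg", "image/gif": "gif",
}


def check_download(filename, signature_type=None, content_type=None):
    """Return a list of problems; an empty list means the evidence agrees."""
    ext = os.path.splitext(filename)[1].lower()
    by_ext = EXTENSION_TYPES.get(ext)
    # Drop parameters such as "; charset=utf-8" before the lookup.
    mime = content_type.split(";")[0].strip().lower() if content_type else None
    by_mime = MIME_TYPES.get(mime)

    evidence = {"extension": by_ext, "signature": signature_type,
                "content_type": by_mime}
    # Unknown or unavailable evidence is ignored, not treated as a mismatch.
    known = {k: v for k, v in evidence.items() if v is not None}

    problems = []
    if len(set(known.values())) > 1:
        details = ", ".join(f"{k}={v}" for k, v in sorted(known.items()))
        problems.append("mismatch: " + details)
    # The most important case: web site data where a file was expected.
    if by_ext is not None and by_ext not in WEB_TYPES:
        for source in ("signature", "content_type"):
            if known.get(source) in WEB_TYPES:
                problems.append(
                    f"web site data ({known[source]}) received for {ext} file")
                break
    return problems
```

For the Rapidshare case, `check_download("movie.part1.rar", "html", "text/html")` reports both a mismatch and web site data, so the download could be rejected instead of being saved as a rar file. A generic content type such as `application/octet-stream` is not in the table and is therefore ignored.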

See http://board.jdownloader.org/showthread.php?t=15948
The target release is the one that follows the release expected within the next month.
