|   | LZMA compression | ||
|  | ---------------- | ||
|  | Version: 9.35 | ||
|  | 
 | ||
|  | This file describes LZMA encoding and decoding functions written in the C language. | ||
|  | 
 | ||
|  | LZMA is an improved version of the well-known LZ77 compression algorithm. | ||
|  | It was designed to maximize the compression ratio while keeping | ||
|  | decompression fast and the memory requirements for decompression low. | ||
|  | 
 | ||
|  | Note: see also the LZMA Specification (lzma-specification.txt in the LZMA SDK). | ||
|  | 
 | ||
|  | You can also look at the source code for LZMA encoding and decoding: | ||
|  |   C/Util/Lzma/LzmaUtil.c | ||
|  | 
 | ||
|  | 
 | ||
|  | LZMA compressed file format | ||
|  | --------------------------- | ||
|  | Offset Size Description | ||
|  |   0     1   Special LZMA properties (lc, lp, pb in encoded form) | ||
|  |   1     4   Dictionary size (little endian) | ||
|  |   5     8   Uncompressed size (little endian). -1 means unknown size | ||
|  |  13         Compressed data | ||
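
The layout above can be read with a few lines of C. This is a minimal sketch, not code from the SDK: the struct and function names are hypothetical, and the decoding of the properties byte (props = (pb * 5 + lp) * 9 + lc) is taken from the LZMA Specification rather than from this file.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical container for the fields of the 13-byte .lzma header. */
typedef struct {
    unsigned lc, lp, pb;     /* values unpacked from the properties byte */
    uint32_t dict_size;      /* dictionary size in bytes */
    uint64_t unpacked_size;  /* UINT64_MAX (i.e. -1) means unknown size */
} LzmaHeaderInfo;

/* Returns 0 on success, -1 if the buffer is shorter than 13 bytes
   or the properties byte is out of range. */
static int parse_lzma_header(const unsigned char *buf, size_t len,
                             LzmaHeaderInfo *out)
{
    unsigned d;
    int i;

    if (len < 13)
        return -1;

    /* Offset 0: props = (pb * 5 + lp) * 9 + lc, so valid values are < 225. */
    d = buf[0];
    if (d >= 9 * 5 * 5)
        return -1;
    out->lc = d % 9;
    d /= 9;
    out->lp = d % 5;
    out->pb = d / 5;

    /* Offset 1: dictionary size, 4 bytes, little endian. */
    out->dict_size = 0;
    for (i = 0; i < 4; i++)
        out->dict_size |= (uint32_t)buf[1 + i] << (8 * i);

    /* Offset 5: uncompressed size, 8 bytes, little endian. */
    out->unpacked_size = 0;
    for (i = 0; i < 8; i++)
        out->unpacked_size |= (uint64_t)buf[5 + i] << (8 * i);

    return 0;
}
```

For example, a properties byte of 0x5D unpacks to lc=3, lp=0, pb=2, and an uncompressed size field of eight 0xFF bytes is the "unknown size" marker.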
|  | 
 | ||
|  | 
 | ||
|  | 
 | ||
|  | ANSI-C LZMA Decoder | ||
|  | ~~~~~~~~~~~~~~~~~~~ | ||
|  | 
 | ||
|  | Please note that the interfaces for the ANSI-C code were changed in LZMA SDK 4.58. | ||
|  | If you want to use the old interfaces, you can download a previous version of the | ||
|  | LZMA SDK from the sourceforge.net site. | ||
|  | 
 | ||
|  | To use the ANSI-C LZMA Decoder you need the following files: | ||
|  | 1) LzmaDec.h + LzmaDec.c + 7zTypes.h + Precomp.h + Compiler.h | ||
|  | 
 | ||
|  | See the example code: | ||
|  |   C/Util/Lzma/LzmaUtil.c | ||
|  | 
 | ||
|  | 
 | ||
|  | Memory requirements for LZMA decoding | ||
|  | ------------------------------------- | ||
|  | 
 | ||
|  | The stack usage of the LZMA decoding function for local variables is no | ||
|  | larger than 200-400 bytes. | ||
|  | 
 | ||
|  | The LZMA Decoder uses a dictionary buffer and an internal state structure. | ||
|  | The internal state structure consumes | ||
|  |   state_size = (4 + (1.5 << (lc + lp))) KB | ||
|  | With the default settings (lc=3, lp=0), state_size = 16 KB. | ||
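
The formula can be checked with a small helper. This is only a sketch of the arithmetic: the function name is hypothetical, and "1.5 << n" is read here as 1.5 * 2^n, which in bytes is 1536 shifted left by n.

```c
#include <stddef.h>

/* state_size = (4 + 1.5 * 2^(lc + lp)) KB, expressed in bytes:
   4 KB = 4096 bytes and 1.5 KB = 1536 bytes. */
static size_t lzma_state_size_bytes(unsigned lc, unsigned lp)
{
    return (size_t)4096 + ((size_t)1536 << (lc + lp));
}
```

With lc=3 and lp=0 this gives 4096 + 1536 * 8 = 16384 bytes, which matches the 16 KB figure for the default settings.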
|  | 
 | ||
|  | 
 | ||
|  | How To decompress data | ||
|  | ---------------------- | ||
|  | 
 | ||
|  | LZMA Decoder (ANSI-C version) now supports 2 interfaces: | ||
|  | 1) Single-call Decompressing | ||
|  | 2) Multi-call State Decompressing (zlib-like interface) | ||
|  | 
 | ||
|  | You must provide an external allocator. | ||
|  | Example: | ||
|  | void *SzAlloc(void *p, size_t size) { p = p; return malloc(size); } | ||
|  | void SzFree(void *p, void *address) { p = p; free(address); } | ||
|  | ISzAlloc alloc = { SzAlloc, SzFree }; | ||
|  | 
 | ||
|  | You can use the p = p; statement to suppress unused-parameter compiler warnings. | ||
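
As an illustration of what an external allocator makes possible, the sketch below counts live allocations so that leaks are easy to spot. It is written under stated assumptions: DemoAlloc is a stand-in with the same two function pointers as ISzAlloc so that the example compiles without the SDK headers, the counter and function names are hypothetical, and (void)p is used as a common alternative to p = p; for silencing the warning.

```c
#include <stddef.h>
#include <stdlib.h>

/* Stand-in with the same shape as ISzAlloc, so this sketch compiles on its
   own. In real code, include 7zTypes.h and use ISzAlloc instead. */
typedef struct {
    void *(*Alloc)(void *p, size_t size);
    void (*Free)(void *p, void *address);
} DemoAlloc;

/* Hypothetical counter: number of blocks allocated but not yet freed. */
static size_t g_live_blocks = 0;

static void *DemoSzAlloc(void *p, size_t size)
{
    void *mem;
    (void)p;                 /* unused parameter */
    mem = malloc(size);
    if (mem != NULL)
        g_live_blocks++;
    return mem;
}

static void DemoSzFree(void *p, void *address)
{
    (void)p;                 /* unused parameter */
    if (address != NULL)
        g_live_blocks--;
    free(address);
}

static DemoAlloc g_demo_alloc = { DemoSzAlloc, DemoSzFree };
```

After all decoder structures have been freed, g_live_blocks should be back at zero.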
|  | 
 | ||
|  | 
 | ||
|  | Single-call Decompressing | ||
|  | ------------------------- | ||
|  | When to use: RAM->RAM decompressing | ||
|  | Compile files: LzmaDec.h + LzmaDec.c + 7zTypes.h | ||
|  | Compile defines: no defines | ||
|  | Memory Requirements: | ||
|  |   - Input buffer: compressed size | ||
|  |   - Output buffer: uncompressed size | ||
|  |   - LZMA Internal Structures: state_size (16 KB for default settings)  | ||
|  | 
 | ||
|  | Interface: | ||
|  |   int LzmaDecode(Byte *dest, SizeT *destLen, const Byte *src, SizeT *srcLen, | ||
|  |       const Byte *propData, unsigned propSize, ELzmaFinishMode finishMode,  | ||
|  |       ELzmaStatus *status, ISzAlloc *alloc); | ||
|  |   In:  | ||
|  |     dest     - output data | ||
|  |     destLen  - output data size | ||
|  |     src      - input data | ||
|  |     srcLen   - input data size | ||
|  |     propData - LZMA properties  (5 bytes) | ||
|  |     propSize - size of propData buffer (5 bytes) | ||
|  |     finishMode - It matters only if decoding reaches the output limit (*destLen). | ||
|  |          LZMA_FINISH_ANY - Decode just destLen bytes. | ||
|  |          LZMA_FINISH_END - Stream must be finished after (*destLen). | ||
|  |                            Use LZMA_FINISH_END when you know that the | ||
|  |                            current output buffer covers the last bytes of the stream. | ||
|  |     alloc    - Memory allocator. | ||
|  | 
 | ||
|  |   Out:  | ||
|  |     destLen  - processed output size  | ||
|  |     srcLen   - processed input size  | ||
|  | 
 | ||
|  |   Output: | ||
|  |     SZ_OK | ||
|  |       status: | ||
|  |         LZMA_STATUS_FINISHED_WITH_MARK | ||
|  |         LZMA_STATUS_NOT_FINISHED  | ||
|  |         LZMA_STATUS_MAYBE_FINISHED_WITHOUT_MARK | ||
|  |     SZ_ERROR_DATA - Data error | ||
|  |     SZ_ERROR_MEM  - Memory allocation error | ||
|  |     SZ_ERROR_UNSUPPORTED - Unsupported properties | ||
|  |     SZ_ERROR_INPUT_EOF - It needs more bytes in input buffer (src). | ||
|  | 
 | ||
|  |   If the LZMA decoder sees the end marker before reaching the output limit, it returns | ||
|  |   an OK result, and the output value of destLen will be less than the output buffer size limit. | ||
|  | 
 | ||
|  |   You can use multiple checks to test data integrity after full decompression: | ||
|  |     1) Check Result and "status" variable. | ||
|  |     2) Check that output(destLen) = uncompressedSize, if you know real uncompressedSize. | ||
|  |     3) Check that output(srcLen) = compressedSize, if you know real compressedSize.  | ||
|  |        You must use the correct finish mode in that case. | ||
|  | 
 | ||
|  | 
 | ||
|  | Multi-call State Decompressing (zlib-like interface) | ||
|  | ---------------------------------------------------- | ||
|  | 
 | ||
|  | When to use: file->file decompressing  | ||
|  | Compile files: LzmaDec.h + LzmaDec.c + 7zTypes.h | ||
|  | 
 | ||
|  | Memory Requirements: | ||
|  |  - Buffer for input stream: any size (for example, 16 KB) | ||
|  |  - Buffer for output stream: any size (for example, 16 KB) | ||
|  |  - LZMA Internal Structures: state_size (16 KB for default settings)  | ||
|  |  - LZMA dictionary (dictionary size is encoded in LZMA properties header) | ||
|  | 
 | ||
|  | 1) Read the LZMA properties (5 bytes) and uncompressed size (8 bytes, little-endian) into header: | ||
|  |    unsigned char header[LZMA_PROPS_SIZE + 8]; | ||
|  |    ReadFile(inFile, header, sizeof(header)); | ||
|  | 
 | ||
|  | 2) Allocate CLzmaDec structures (state + dictionary) using LZMA properties | ||
|  | 
 | ||
|  |   CLzmaDec state; | ||
|  |   LzmaDec_Constr(&state); | ||
|  |   res = LzmaDec_Allocate(&state, header, LZMA_PROPS_SIZE, &g_Alloc); | ||
|  |   if (res != SZ_OK) | ||
|  |     return res; | ||
|  | 
 | ||
|  | 3) Initialize the LzmaDec structure before each new LZMA stream, then call LzmaDec_DecodeToBuf in a loop | ||
|  | 
 | ||
|  |   LzmaDec_Init(&state); | ||
|  |   for (;;) | ||
|  |   { | ||
|  |     ...  | ||
|  |     int res = LzmaDec_DecodeToBuf(CLzmaDec *p, Byte *dest, SizeT *destLen,  | ||
|  |         const Byte *src, SizeT *srcLen, ELzmaFinishMode finishMode); | ||
|  |     ... | ||
|  |   } | ||
|  | 
 | ||
|  | 
 | ||
|  | 4) Free all allocated structures | ||
|  |   LzmaDec_Free(&state, &g_Alloc); | ||
|  | 
 | ||
|  | See the example code: | ||
|  |   C/Util/Lzma/LzmaUtil.c | ||
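
The buffer bookkeeping in the loop of step 3 can be sketched without the SDK. In the example below, fake_decode_to_buf is a hypothetical stand-in for LzmaDec_DecodeToBuf: it only copies bytes, but it follows the same convention that srcLen and destLen hold the available sizes on input and the processed sizes on output. The buffer sizes and all names are illustrative assumptions, and the real decoder additionally needs a finish mode and status handling.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the decoder call: copies bytes and reports how
   many input bytes were consumed and how many output bytes were produced. */
static int fake_decode_to_buf(unsigned char *dest, size_t *destLen,
                              const unsigned char *src, size_t *srcLen)
{
    size_t n = (*srcLen < *destLen) ? *srcLen : *destLen;
    memcpy(dest, src, n);
    *srcLen = n;    /* bytes consumed */
    *destLen = n;   /* bytes produced */
    return 0;
}

/* Moves `total` bytes from `in` to `out` through small fixed buffers and
   returns the number of bytes written. */
static size_t stream_copy(const unsigned char *in, size_t total,
                          unsigned char *out)
{
    unsigned char inBuf[8], outBuf[5];
    size_t inPos = 0, inSize = 0, readPos = 0, written = 0;

    for (;;) {
        size_t srcLen, destLen;

        if (inPos == inSize) {            /* input buffer drained: refill */
            inSize = total - readPos;
            if (inSize > sizeof inBuf)
                inSize = sizeof inBuf;
            memcpy(inBuf, in + readPos, inSize);
            readPos += inSize;
            inPos = 0;
        }

        srcLen = inSize - inPos;          /* input bytes available */
        destLen = sizeof outBuf;          /* output space available */
        if (fake_decode_to_buf(outBuf, &destLen, inBuf + inPos, &srcLen) != 0)
            break;

        inPos += srcLen;                  /* advance by what was consumed */
        memcpy(out + written, outBuf, destLen);   /* "write" the output */
        written += destLen;

        if (srcLen == 0 && destLen == 0)  /* no progress: input exhausted */
            break;
    }
    return written;
}
```

The input and output buffers are deliberately different sizes here, to show that a single call may consume only part of the input buffer.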
|  | 
 | ||
|  | 
 | ||
|  | How To compress data | ||
|  | -------------------- | ||
|  | 
 | ||
|  | Compile files:  | ||
|  |   7zTypes.h | ||
|  |   Threads.h	 | ||
|  |   LzmaEnc.h | ||
|  |   LzmaEnc.c | ||
|  |   LzFind.h | ||
|  |   LzFind.c | ||
|  |   LzFindMt.h | ||
|  |   LzFindMt.c | ||
|  |   LzHash.h | ||
|  | 
 | ||
|  | Memory Requirements: | ||
|  |   - (dictSize * 11.5 + 6 MB) + state_size | ||
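
To get a feel for the numbers, the estimate can be written as a helper. This is a rough sketch of the formula above only: the function name is hypothetical, the state size is supplied by the caller (16 KB for default settings), and the result is an approximation rather than an exact allocation count.

```c
#include <stdint.h>

/* Encoder memory estimate in bytes: dictSize * 11.5 + 6 MB + state_size.
   dictSize * 11.5 is computed as d * 11 + d / 2 to stay in integers. */
static uint64_t lzma_enc_mem_estimate(uint32_t dict_size, uint64_t state_size)
{
    uint64_t d = dict_size;
    return d * 11 + d / 2 + (uint64_t)6 * 1024 * 1024 + state_size;
}
```

For a 1 MB dictionary and a 16 KB state this gives 17.5 MB + 16 KB, or 18366464 bytes.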
|  | 
 | ||
|  | The LZMA Encoder can use two memory allocators: | ||
|  | 1) alloc - for small arrays. | ||
|  | 2) allocBig - for big arrays. | ||
|  | 
 | ||
|  | For example, you can use Large RAM Pages (2 MB) in the allocBig allocator for | ||
|  | better compression speed. Note that Windows has a poor implementation of | ||
|  | Large RAM Pages. | ||
|  | It's OK to use the same allocator for alloc and allocBig. | ||
|  | 
 | ||
|  | 
 | ||
|  | Single-call Compression with callbacks | ||
|  | -------------------------------------- | ||
|  | 
 | ||
|  | See the example code: | ||
|  |   C/Util/Lzma/LzmaUtil.c | ||
|  | 
 | ||
|  | When to use: file->file compressing  | ||
|  | 
 | ||
|  | 1) You must implement callback structures for the interfaces: | ||
|  | ISeqInStream | ||
|  | ISeqOutStream | ||
|  | ICompressProgress | ||
|  | ISzAlloc | ||
|  | 
 | ||
|  | static void *SzAlloc(void *p, size_t size) { p = p; return MyAlloc(size); } | ||
|  | static void SzFree(void *p, void *address) {  p = p; MyFree(address); } | ||
|  | static ISzAlloc g_Alloc = { SzAlloc, SzFree }; | ||
|  | 
 | ||
|  |   CFileSeqInStream inStream; | ||
|  |   CFileSeqOutStream outStream; | ||
|  | 
 | ||
|  |   inStream.funcTable.Read = MyRead; | ||
|  |   inStream.file = inFile; | ||
|  |   outStream.funcTable.Write = MyWrite; | ||
|  |   outStream.file = outFile; | ||
|  | 
 | ||
|  | 
 | ||
|  | 2) Create CLzmaEncHandle object; | ||
|  | 
 | ||
|  |   CLzmaEncHandle enc; | ||
|  | 
 | ||
|  |   enc = LzmaEnc_Create(&g_Alloc); | ||
|  |   if (enc == 0) | ||
|  |     return SZ_ERROR_MEM; | ||
|  | 
 | ||
|  | 
 | ||
|  | 3) initialize CLzmaEncProps properties; | ||
|  | 
 | ||
|  |   CLzmaEncProps props; | ||
|  |   LzmaEncProps_Init(&props); | ||
|  | 
 | ||
|  |   Then you can change some properties in that structure. | ||
|  | 
 | ||
|  | 4) Send LZMA properties to LZMA Encoder | ||
|  | 
 | ||
|  |   res = LzmaEnc_SetProps(enc, &props); | ||
|  | 
 | ||
|  | 5) Write encoded properties to header | ||
|  | 
 | ||
|  |     Byte header[LZMA_PROPS_SIZE + 8]; | ||
|  |     size_t headerSize = LZMA_PROPS_SIZE; | ||
|  |     UInt64 fileSize; | ||
|  |     int i; | ||
|  | 
 | ||
|  |     res = LzmaEnc_WriteProperties(enc, header, &headerSize); | ||
|  |     fileSize = MyGetFileLength(inFile); | ||
|  |     for (i = 0; i < 8; i++) | ||
|  |       header[headerSize++] = (Byte)(fileSize >> (8 * i)); | ||
|  |     MyWriteFileAndCheck(outFile, header, headerSize); | ||
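
The size-writing loop in step 5 can be isolated into a helper that is easy to test on its own. This is a minimal sketch with a hypothetical function name; it assumes the caller has already placed the 5 property bytes at the start of the header.

```c
#include <stddef.h>
#include <stdint.h>

/* Appends `size` as 8 little-endian bytes starting at header[pos] and
   returns the new header length. Pass UINT64_MAX when the uncompressed
   size is unknown (the -1 case in the file format table). */
static size_t append_size_le(unsigned char *header, size_t pos, uint64_t size)
{
    int i;
    for (i = 0; i < 8; i++)
        header[pos++] = (unsigned char)(size >> (8 * i));
    return pos;
}
```

Called with pos = 5 (LZMA_PROPS_SIZE), it returns 13, the full header length.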
|  | 
 | ||
|  | 6) Call encoding function: | ||
|  |       res = LzmaEnc_Encode(enc, &outStream.funcTable, &inStream.funcTable,  | ||
|  |         NULL, &g_Alloc, &g_Alloc); | ||
|  | 
 | ||
|  | 7) Destroy LZMA Encoder Object | ||
|  |   LzmaEnc_Destroy(enc, &g_Alloc, &g_Alloc); | ||
|  | 
 | ||
|  | 
 | ||
|  | If a callback function returns an error code, LzmaEnc_Encode returns that code as well, | ||
|  | or it can return a code such as SZ_ERROR_READ, SZ_ERROR_WRITE or SZ_ERROR_PROGRESS. | ||
|  | 
 | ||
|  | 
 | ||
|  | Single-call RAM->RAM Compression | ||
|  | -------------------------------- | ||
|  | 
 | ||
|  | Single-call RAM->RAM Compression is similar to Compression with callbacks, | ||
|  | but you provide pointers to buffers instead of pointers to stream callbacks: | ||
|  | 
 | ||
|  | SRes LzmaEncode(Byte *dest, SizeT *destLen, const Byte *src, SizeT srcLen, | ||
|  |     const CLzmaEncProps *props, Byte *propsEncoded, SizeT *propsSize, int writeEndMark,  | ||
|  |     ICompressProgress *progress, ISzAlloc *alloc, ISzAlloc *allocBig); | ||
|  | 
 | ||
|  | Return code: | ||
|  |   SZ_OK               - OK | ||
|  |   SZ_ERROR_MEM        - Memory allocation error  | ||
|  |   SZ_ERROR_PARAM      - Incorrect parameter | ||
|  |   SZ_ERROR_OUTPUT_EOF - output buffer overflow | ||
|  |   SZ_ERROR_THREAD     - errors in multithreading functions (only for Mt version) | ||
|  | 
 | ||
|  | 
 | ||
|  | 
 | ||
|  | Defines | ||
|  | ------- | ||
|  | 
 | ||
|  | _LZMA_SIZE_OPT - Enable some optimizations in LZMA Decoder to get smaller executable code. | ||
|  | 
 | ||
|  | _LZMA_PROB32   - It can increase the speed on some 32-bit CPUs, but memory usage for  | ||
|  |                  some structures will be doubled in that case. | ||
|  | 
 | ||
|  | _LZMA_UINT32_IS_ULONG  - Define it if int is 16-bit on your compiler and long is 32-bit. | ||
|  | 
 | ||
|  | _LZMA_NO_SYSTEM_SIZE_T  - Define it if you don't want to use size_t type. | ||
|  | 
 | ||
|  | 
 | ||
|  | _7ZIP_PPMD_SUPPPORT - Define it if you don't want to support PPMD method in ANSI-C .7z decoder. | ||
|  | 
 | ||
|  | 
 | ||
|  | C++ LZMA Encoder/Decoder  | ||
|  | ~~~~~~~~~~~~~~~~~~~~~~~~ | ||
|  | The C++ LZMA code uses COM-like interfaces, so if you want to use it, | ||
|  | it helps to study the basics of COM/OLE. | ||
|  | The C++ LZMA code is just a wrapper over the ANSI-C code. | ||
|  | 
 | ||
|  | 
 | ||
|  | C++ Notes | ||
|  | ~~~~~~~~~ | ||
|  | If you use some C++ code folders in 7-Zip (for example, C++ code for .7z handling), | ||
|  | you must check that you handle the "new" operator correctly. | ||
|  | 7-Zip can be compiled with MSVC 6.0, which doesn't throw an exception from the "new" operator. | ||
|  | So 7-Zip uses "CPP\Common\NewHandler.cpp", which redefines the "new" operator: | ||
|  | void *operator new(size_t size) | ||
|  | { | ||
|  |   void *p = ::malloc(size); | ||
|  |   if (p == 0) | ||
|  |     throw CNewException(); | ||
|  |   return p; | ||
|  | } | ||
|  | If you use a version of MSVC that throws an exception from the "new" operator, you can | ||
|  | compile without "NewHandler.cpp", so the standard exception will be used. In fact, some | ||
|  | 7-Zip code catches any exception in internal code and converts it to an HRESULT code. | ||
|  | So you don't need to catch CNewException if you call the COM interfaces of 7-Zip. | ||
|  | 
 | ||
|  | --- | ||
|  | 
 | ||
|  | http://www.7-zip.org | ||
|  | http://www.7-zip.org/sdk.html | ||
|  | http://www.7-zip.org/support.html |