Commit Graph

15189 Commits

Author SHA1 Message Date
Ryan Dick
20acfc9a00 Raise in CustomEmbedding and CustomGroupNorm if a patch is applied. 2024-12-28 20:49:17 +00:00
Ryan Dick
918f541af8 Add unit test for a SetParameterLayer patch applied to a CustomFluxRMSNorm layer. 2024-12-28 20:44:48 +00:00
Ryan Dick
93e76b61d6 Add CustomFluxRMSNorm layer. 2024-12-28 20:33:38 +00:00
Ryan Dick
f692e217ea Add patch support to CustomConv1d and CustomConv2d (no unit tests yet). 2024-12-27 22:23:17 +00:00
Ryan Dick
f2981979f9 Get custom layer patches working with all quantized linear layer types. 2024-12-27 22:00:22 +00:00
Ryan Dick
ef970a1cdc Add support for FluxControlLoRALayer in CustomLinear layers and add a unit test for it. 2024-12-27 21:00:47 +00:00
Ryan Dick
5ee7405f97 Add more unit tests for custom module LoRA patching: multiple LoRAs and ConcatenatedLoRALayers. 2024-12-27 19:47:21 +00:00
Ryan Dick
e24e386a27 Add support for patches to CustomModuleMixin and add a single unit test (more to come). 2024-12-27 18:57:13 +00:00
Ryan Dick
b06d61e3c0 Improve custom layer wrap/unwrap logic. 2024-12-27 16:29:48 +00:00
Ryan Dick
7d6ab0ceb2 Add a CustomModuleMixin class with a flag for enabling/disabling autocasting (since it incurs some runtime speed overhead). 2024-12-26 20:08:30 +00:00
Ryan Dick
9692a36dd6 Use a fixture to parameterize tests in test_all_custom_modules.py so that a fresh instance of the layer under test is initialized for each test. 2024-12-26 19:41:25 +00:00
Ryan Dick
b0b699a01f Add unit test to test that isinstance(...) behaves as expected with custom module types. 2024-12-26 18:45:56 +00:00
Ryan Dick
a8b2c4c3d2 Add inference tests for all custom module types (i.e. to test autocasting from cpu to device). 2024-12-26 18:33:46 +00:00
Ryan Dick
03944191db Split test_autocast_modules.py into separate test files to mirror the source file structure. 2024-12-24 22:29:11 +00:00
Ryan Dick
987c9ae076 Move custom autocast modules to separate files in a custom_modules/ directory. 2024-12-24 22:21:31 +00:00
Ryan Dick
6d7314ac0a Consolidate the LayerPatching patching modes into a single implementation. 2024-12-24 15:57:54 +00:00
Ryan Dick
80db9537ff Rename model_patcher.py -> layer_patcher.py. 2024-12-24 15:57:54 +00:00
Ryan Dick
6f926f05b0 Update apply_smart_model_patches() so that layer restore matches the behavior of non-smart mode. 2024-12-24 15:57:54 +00:00
Ryan Dick
61253b91f1 Enable LoRAPatcher.apply_smart_lora_patches(...) throughout the stack. 2024-12-24 15:57:54 +00:00
Ryan Dick
0148512038 (minor) Rename num_layers -> num_loras in unit tests. 2024-12-24 15:57:54 +00:00
Ryan Dick
d0f35fceed Add test_apply_smart_lora_patches_to_partially_loaded_model(...). 2024-12-24 15:57:54 +00:00
Ryan Dick
cefcb340d9 Add LoRAPatcher.smart_apply_lora_patches() 2024-12-24 15:57:54 +00:00
Ryan Dick
0fc538734b Skip flaky test when running on GitHub Actions, and further reduce peak unit test memory. 2024-12-24 14:32:11 +00:00
Ryan Dick
7214d4969b Workaround a weird quirk of QuantState.to() and add a unit test to exercise it. 2024-12-24 14:32:11 +00:00
Ryan Dick
a83a999b79 Reduce peak memory used for unit tests. 2024-12-24 14:32:11 +00:00
Ryan Dick
f8a6accf8a Fix bitsandbytes imports to avoid ImportErrors on macOS. 2024-12-24 14:32:11 +00:00
Ryan Dick
f8ab414f99 Add CachedModelOnlyFullLoad to mirror the CachedModelWithPartialLoad for models that cannot or should not be partially loaded. 2024-12-24 14:32:11 +00:00
Ryan Dick
c6795a1b47 Make CachedModelWithPartialLoad work with models that have non-persistent buffers. 2024-12-24 14:32:11 +00:00
Ryan Dick
0a8fc74ae9 Add CachedModelWithPartialLoad to manage partially-loaded models using the new autocast modules. 2024-12-24 14:32:11 +00:00
Ryan Dick
dc54e8763b Add CustomInvokeLinearNF4 to enable CPU -> GPU streaming for InvokeLinearNF4 layers. 2024-12-24 14:32:11 +00:00
Ryan Dick
1b56020876 Add CustomInvokeLinear8bitLt layer for device streaming with InvokeLinear8bitLt layers. 2024-12-24 14:32:11 +00:00
Ryan Dick
3f990393a1 Simplify the state management in InvokeLinear8bitLt and add unit tests. This is in preparation for wrapping it to support streaming of weights from cpu to gpu. 2024-12-24 14:32:11 +00:00
Ryan Dick
97d56f7dc9 Add torch module autocast unit test for GGUF-quantized models. 2024-12-24 14:32:11 +00:00
Ryan Dick
fe0ef2c27c Add torch module autocast utilities. 2024-12-24 14:32:11 +00:00
Ryan Dick
65fcbf5f60 Bump bitsandbytes. The new version contains improvements to state_dict loading/saving for LLM.int8 and promises improved speed on some HW. 2024-12-24 14:32:11 +00:00
Ryan Dick
d3916dbdb6 Partial Loading PR1: Tidy ModelCache (#7492) 2024-12-24 09:30:44 -05:00
## Summary

This PR tidies up the model cache code in preparation for further
refactoring to support partial loading of models onto the GPU. **These
code changes should not change the functional behavior in any way.**

Changes:
- Remove the `ModelCacheBase` class. `ModelCache` is the only
implementation, so there is no benefit to the separate abstract class.
- Split `CacheRecord` and `CacheStats` out into their own files.
- Remove the `ModelLocker` class. This extra layer of indirection was not providing any benefit. Locking is now done directly with the `ModelCache` (see the sketch after this list).
- Tidy up relative imports that were contributing to circular import
issues.
- Pull the 'submodel' concern out of the `ModelCache`. The `ModelCache`
should not need to be aware of the model manager submodel system.
- Delete unused properties from the `ModelCache` (e.g.
`.lazy_offloading`, `.storage_device`, etc.)
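
To make the `ModelLocker` removal concrete, here is a minimal sketch of locking handled directly by the cache object. The class and method names below (`ModelCacheSketch`, `put`, `lock`, `unlock`) are assumptions made for illustration, not the actual InvokeAI `ModelCache` API.

```python
# Illustrative sketch only: ModelCacheSketch and its put/lock/unlock methods are
# assumptions made for this example, not the real InvokeAI ModelCache interface.
import threading
from typing import Any


class ModelCacheSketch:
    """Toy cache where lock/unlock live on the cache itself (no ModelLocker wrapper)."""

    def __init__(self) -> None:
        self._records: dict[str, Any] = {}
        self._mutex = threading.Lock()
        self._locked_keys: set[str] = set()

    def put(self, key: str, model: Any) -> None:
        with self._mutex:
            self._records[key] = model

    def lock(self, key: str) -> Any:
        # Callers previously went through a separate ModelLocker object; here the
        # cache itself returns the model and records that it is in use, so it is
        # not offloaded while locked.
        with self._mutex:
            self._locked_keys.add(key)
            return self._records[key]

    def unlock(self, key: str) -> None:
        # Mark the model as no longer in use so the cache may offload it again.
        with self._mutex:
            self._locked_keys.discard(key)


if __name__ == "__main__":
    cache = ModelCacheSketch()
    cache.put("sd1.5-unet", object())
    model = cache.lock("sd1.5-unet")
    # ... run inference with `model` here ...
    cache.unlock("sd1.5-unet")
```

Collapsing the wrapper this way keeps the lock bookkeeping in one place, in line with the PR's stated goal of removing indirection before the partial-loading refactor.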

## QA Instructions

I ran smoke tests with a variety of SD1, SDXL and FLUX models. No change
to behavior is expected.

## Merge Plan


## Checklist

- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [x] _Tests added / updated (if applicable)_
- [x] _Documentation added / updated (if applicable)_
- [ ] _Updated `What's New` copy (if doing a release after this PR)_
2024-12-24 09:30:44 -05:00
Ryan Dick
55b13c1da3 (minor) Add TODO comment regarding the location of get_model_cache_key(). 2024-12-24 14:23:19 +00:00
Ryan Dick
7dc3e0fdbe Get rid of ModelLocker. It was an unnecessary layer of indirection. 2024-12-24 14:23:18 +00:00
Ryan Dick
a39bcf7e85 Move lock(...) and unlock(...) logic from ModelLocker to the ModelCache and make a bunch of ModelCache properties/methods private. 2024-12-24 14:23:18 +00:00
Ryan Dick
a7c72992a6 Pull get_model_cache_key(...) out of ModelCache. The ModelCache should not be concerned with implementation details like the submodel_type. 2024-12-24 14:23:18 +00:00
Ryan Dick
d30a9ced38 Rename model_cache_default.py -> model_cache.py. 2024-12-24 14:23:18 +00:00
Ryan Dick
e0bfa6157b Remove ModelCacheBase. 2024-12-24 14:23:18 +00:00
Ryan Dick
83ea6420e2 Move CacheStats to its own file. 2024-12-24 14:23:18 +00:00
Ryan Dick
ce11a1952e Move CacheRecord out to its own file. 2024-12-24 14:23:18 +00:00
Ryan Dick
e48dee4c4a Rip out ModelLockerBase. 2024-12-24 14:23:18 +00:00
Simon Fuhrmann
712674b6dd Add Stereogram Nodes to communityNodes.md 2024-12-23 13:51:53 -05:00
psychedelicious
de0043f443 docs: update download links for launcher 2024-12-23 13:23:14 +11:00
Riku
d21506da6f feat(ci): add typegen check workflow 2024-12-22 06:05:17 +11:00
psychedelicious
a49894901a docs: fix installation docs home again 2024-12-20 17:35:50 +11:00
psychedelicious
e7e26c8a93 docs: fix installation docs home 2024-12-20 17:12:44 +11:00