Compare commits
No commits in common. "749e22335e9065811e3e253dd6e8909c558bd022" and "822c90599b23b1db656146d426aa1ff8b5d27ed5" have entirely different histories.
749e22335e
...
822c90599b
|
@ -1,21 +0,0 @@
|
||||||
# LLavaTagger
|
|
||||||
|
|
||||||
LLavaTagger is a Python script that tags images based on a given prompt using the [LLaVA](https://llava-vl.github.io/) multimodal LLM. LLavaTagger can spread this work across any number of GPUs using distributed data parallel (DDP).
|
|
||||||
|
|
||||||
## How to use
|
|
||||||
|
|
||||||
First create a Python venv and install the required packages into it:
|
|
||||||
|
|
||||||
$ python -m venv venv
|
|
||||||
$ source venv/bin/activate
|
|
||||||
$ pip install -r requirements.txt
|
|
||||||
|
|
||||||
Then run LLavaTagger, for instance like so:
|
|
||||||
|
|
||||||
$ python LLavaTagger.py --common_description "an image of a cat, " --prompt "describe the cat in 10 to 20 words" --batch 8 --quantize --image_dir ~/cat_images
|
|
||||||
|
|
||||||
By default LLavaTagger runs in parallel on all available GPUs. If this is undesirable, use the ROCR_VISIBLE_DEVICES= or CUDA_VISIBLE_DEVICES= environment variable to hide unwanted GPUs.
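As a minimal sketch of restricting the visible GPUs from a Python launcher, the helper below builds an environment with one of the two variables set. The function name `gpu_env` and its parameters are made up for this example and are not part of LLavaTagger.

```python
import os


def gpu_env(visible, variable="CUDA_VISIBLE_DEVICES", base=None):
    """Return a copy of the environment that exposes only the listed GPUs.

    `visible` is a list of device indices and `variable` is one of the two
    environment variables named above. `gpu_env` is a hypothetical helper
    written for this example only.
    """
    env = dict(os.environ if base is None else base)
    # Both variables take a comma-separated list of device indices.
    env[variable] = ",".join(str(index) for index in visible)
    return env
```

The result can be passed as the `env` argument of `subprocess.run` when starting LLavaTagger.py, for example `gpu_env([0, 1])` to expose only the first two CUDA devices.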
|
|
||||||
|
|
||||||
LLavaTagger will then create a meta.jsonl in the image directory, suitable for use by the scripts of [diffusers](https://github.com/huggingface/diffusers) to train Stable Diffusion (XL). If other formats are desired, ../utils contains scripts to transform the metadata into other formats, for instance for use with [kohya](https://github.com/bmaltais/kohya_ss).
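To illustrate what such a transformation can look like, here is a rough sketch that turns a meta.jsonl into one caption text file per image. It is not one of the scripts in ../utils. The key names `file_name` and `text` are assumptions, so check a line of your own meta.jsonl before relying on them.

```python
import json
from pathlib import Path


def write_caption_files(image_dir, meta_name="meta.jsonl",
                        file_key="file_name", text_key="text"):
    """Write one <image stem>.txt caption file per record in the metadata.

    The key names are assumptions, not confirmed by LLavaTagger.
    Returns the list of caption files written.
    """
    image_dir = Path(image_dir)
    written = []
    with open(image_dir / meta_name, encoding="utf-8") as meta:
        for line in meta:
            line = line.strip()
            if not line:
                continue  # tolerate blank lines between records
            record = json.loads(line)
            # cat1.jpg -> cat1.txt, placed next to the image
            caption_path = (image_dir / record[file_key]).with_suffix(".txt")
            caption_path.write_text(record[text_key] + "\n", encoding="utf-8")
            written.append(caption_path)
    return written
```

Calling `write_caption_files("~/cat_images")` would need the path expanded first, for example with `Path.expanduser()`.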
|
|
||||||
|
|
||||||
If editing the created tags is desired, [QImageTagger](https://uvos.xyz/git/uvos/QImageTagger) can be used for this purpose.
|
|
|
@ -5,7 +5,7 @@ ninja==1.11.1.1
|
||||||
safetensors==0.4.2
|
safetensors==0.4.2
|
||||||
tokenizers==0.15.2
|
tokenizers==0.15.2
|
||||||
transformers
|
transformers
|
||||||
torch
|
pytorch
|
||||||
opencv-python
|
opencv-python
|
||||||
numpy
|
numpy
|
||||||
tqdm
|
tqdm
|
||||||
|
|
|
@ -1,20 +0,0 @@
|
||||||
### PersonDatasetAssembler
|
|
||||||
|
|
||||||
PersonDatasetAssembler is a Python script that finds images of a specific person, specified by a reference image, in a directory of images or in a video file. PersonDatasetAssembler also supports raw images.
|
|
||||||
|
|
||||||
## How to use
|
|
||||||
|
|
||||||
First create a Python venv and install the required packages into it:
|
|
||||||
|
|
||||||
$ python -m venv venv
|
|
||||||
$ source venv/bin/activate
|
|
||||||
$ pip install -r requirements.txt
|
|
||||||
|
|
||||||
Then run PersonDatasetAssembler, for instance like so:
|
|
||||||
|
|
||||||
$ python PersonDatasetAssembler.py --referance someperson.jpg --match_model ../Weights/face_recognition_sface_2021dec.onnx --detect_model ../Weights/face_detection_yunet_2023mar.onnx --input ~/Photos --out imagesOfSomePerson
|
|
||||||
|
|
||||||
Or to extract images from a video:
|
|
||||||
|
|
||||||
$ python PersonDatasetAssembler.py --referance someperson.jpg --match_model ../Weights/face_recognition_sface_2021dec.onnx --detect_model ../Weights/face_detection_yunet_2023mar.onnx -i ~/SomeVideo.mkv --out imagesOfSomePerson
|
|
||||||
|
|
|
@ -1,4 +0,0 @@
|
||||||
numpy==1.26.4
|
|
||||||
opencv-python==4.10.0.82
|
|
||||||
tqdm==4.66.4
|
|
||||||
Wand==0.6.13
|
|
35
README.md
|
@ -1,35 +0,0 @@
|
||||||
# SDImagePreprocess
|
|
||||||
|
|
||||||
This repo contains a collection of high-performance tools intended to ease the creation of datasets for training image generation AI like Stable Diffusion.
|
|
||||||
|
|
||||||
## Included tools
|
|
||||||
|
|
||||||
This repo contains the following tools:
|
|
||||||
|
|
||||||
### SmartCrop
|
|
||||||
|
|
||||||
SmartCrop is an application that uses content-aware cropping, [seam carving](https://en.wikipedia.org/wiki/Seam_carving) and resizing to bring a directory of images into the desired size and aspect ratio for training. SmartCrop is configurable to prioritize specific items or specific persons in the images provided.
|
|
||||||
|
|
||||||
#### Content detected in image:
|
|
||||||
|
|
||||||

|
|
||||||
|
|
||||||
#### Cropped image based on content:
|
|
||||||

|
|
||||||
|
|
||||||
### PersonDatasetAssembler
|
|
||||||
|
|
||||||
PersonDatasetAssembler is a Python script that finds images of a specific person, specified by a reference image, in a directory of images or in a video file. PersonDatasetAssembler also supports raw images.
|
|
||||||
|
|
||||||
### LLavaTagger
|
|
||||||
|
|
||||||
LLavaTagger is a Python script that tags images based on a given prompt using the [LLaVA](https://llava-vl.github.io/) multimodal LLM. LLavaTagger can spread this work across any number of GPUs using distributed data parallel (DDP).
|
|
||||||
|
|
||||||
### DanbooruTagger
|
|
||||||
|
|
||||||
DanbooruTagger is a Python script of dubious utility that tags images using the [DeepDanbooru](https://github.com/KichangKim/DeepDanbooru) convolutional network.
|
|
||||||
|
|
||||||
|
|
||||||
## License
|
|
||||||
|
|
||||||
All files in this repo are licensed under the GPL v3, see LICENSE.
|
|
|
@ -1,50 +0,0 @@
|
||||||
# SmartCrop
|
|
||||||
|
|
||||||
SmartCrop is an application that uses content-aware cropping, [seam carving](https://en.wikipedia.org/wiki/Seam_carving) and resizing to bring a directory of images into the desired size and aspect ratio for training. SmartCrop is configurable to prioritize specific items or specific persons in the images provided.
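For readers unfamiliar with seam carving, the following is a textbook sketch of its core step, not SmartCrop's actual C++ implementation. It assumes an energy map is already available as a list of rows of numbers, where low values mark pixels that can be removed with little visual impact. Both function names are chosen for this example.

```python
def find_vertical_seam(energy):
    """Return, for each row, the column of the lowest-energy vertical seam.

    `energy` is a list of equally long rows of non-negative numbers.
    """
    height, width = len(energy), len(energy[0])
    # cost[y][x] = minimal total energy of any seam ending at (y, x)
    cost = [list(energy[0])]
    for y in range(1, height):
        prev = cost[-1]
        row = []
        for x in range(width):
            lo, hi = max(x - 1, 0), min(x + 1, width - 1)
            row.append(energy[y][x] + min(prev[lo:hi + 1]))
        cost.append(row)
    # Backtrack upwards from the cheapest cell in the last row.
    x = min(range(width), key=cost[-1].__getitem__)
    seam = [x]
    for y in range(height - 1, 0, -1):
        lo, hi = max(x - 1, 0), min(x + 1, width - 1)
        x = min(range(lo, hi + 1), key=cost[y - 1].__getitem__)
        seam.append(x)
    seam.reverse()
    return seam


def remove_vertical_seam(rows, seam):
    """Drop one pixel per row, narrowing the image by one column."""
    return [row[:x] + row[x + 1:] for row, x in zip(rows, seam)]
```

Repeating these two steps narrows an image one column at a time while steering around high-energy content. A tool that prioritizes persons or items could do so by raising the energy of the detected regions, though how SmartCrop does this is not described here.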
|
|
||||||
|
|
||||||
## Requirements
|
|
||||||
|
|
||||||
* [cmake](https://cmake.org/) 3.6 or later
|
|
||||||
* [opencv](https://opencv.org/) 4.8 or later
|
|
||||||
* A C++17-capable compiler and standard library, such as GCC or LLVM/Clang
|
|
||||||
* git is required to get the source
|
|
||||||
|
|
||||||
## Building
|
|
||||||
|
|
||||||
The steps to build this application are:
|
|
||||||
|
|
||||||
$ git clone https://uvos.xyz/git/uvos/SDImagePreprocess.git
|
|
||||||
$ cd SDImagePreprocess
|
|
||||||
$ mkdir build && cd build
|
|
||||||
$ cmake ..
|
|
||||||
$ make
|
|
||||||
|
|
||||||
The binary can then be found in build/SmartCrop and can optionally be installed with:
|
|
||||||
|
|
||||||
$ sudo make install
|
|
||||||
|
|
||||||
## Basic usage
|
|
||||||
|
|
||||||
To process all images in the directory ~/images and output the images into the directory processedImages:
|
|
||||||
|
|
||||||
$ smartcrop --out processedImages ~/images/*
|
|
||||||
|
|
||||||
To also focus on the person in the image ~/person.jpg:
|
|
||||||
|
|
||||||
$ smartcrop --out processedImages --focus-person ~/person.jpg ~/images/*
|
|
||||||
|
|
||||||
To also enable seam carving:
|
|
||||||
|
|
||||||
$ smartcrop --out processedImages --focus-person ~/person.jpg --seam-carving ~/images/*
|
|
||||||
|
|
||||||
See smartcrop --help for more options.
|
|
||||||
|
|
||||||
## Example
|
|
||||||
|
|
||||||
#### Content detected in image:
|
|
||||||

|
|
||||||
|
|
||||||
#### Cropped image based on content:
|
|
||||||

|
|
||||||
|
|
||||||
|
|