add readmes

PersonDatasetAssembler: add requirements.txt
LLavaTagger: correct requirements
2024-06-14 00:09:14 +02:00 · 2024-06-14 00:08:34 +02:00 · 2024-06-13 23:52:52 +02:00
6 changed files with 131 additions and 1 deletions
--- a/LLavaTagger/README.md
+++ b/LLavaTagger/README.md
@@ -0,0 +1,21 @@
+# LLavaTagger
+
+LLavaTagger is a Python script that tags images based on a given prompt using the [LLaVA](https://llava-vl.github.io/) multimodal LLM. LLavaTagger can use any number of GPUs in parallel via DDP (distributed data parallel) for this task.
+
+## How to use
+
+First, create a Python venv and install the required packages into it:
+
+	$ python -m venv venv
+	$ source venv/bin/activate
+	$ pip install -r requirements.txt
+
+Then run LLavaTagger, for instance like so:
+
+	$ python LLavaTagger.py --common_description "an image of a cat, " --prompt "describe the cat in 10 to 20 words" --batch 8 --quantize --image_dir ~/cat_images
+
+By default, LLavaTagger runs in parallel on all available GPUs. If this is undesirable, use the ROCR_VISIBLE_DEVICES= or CUDA_VISIBLE_DEVICES= environment variable to hide unwanted GPUs.
+
+LLavaTagger will then create a meta.jsonl in the image directory, suitable for use by the scripts of [diffusers](https://github.com/huggingface/diffusers) to train Stable Diffusion (XL). If other formats are desired, ../utils contains scripts to transform the metadata into other formats, for instance for use with [kohya](https://github.com/bmaltais/kohya_ss).
+
+If you want to edit the created tags, [QImageTagger](https://uvos.xyz/git/uvos/QImageTagger) can be used for this purpose.
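As a rough illustration of the kind of metadata transformation the LLavaTagger README mentions, the sketch below turns a meta.jsonl into one sidecar .txt caption file per image, a layout commonly used by kohya-style trainers. It is not one of the scripts in ../utils. The function name and the `file_name` and `text` keys are assumptions about the metadata layout, so check them against the file LLavaTagger actually writes.

```python
import json
from pathlib import Path


def write_sidecar_captions(image_dir, meta_name="meta.jsonl",
                           file_key="file_name", text_key="text"):
    """Write one <image stem>.txt caption file per entry in the metadata file.

    file_key and text_key are assumptions about the metadata layout; adjust
    them to whatever keys your meta.jsonl actually contains.
    Returns the list of caption files written.
    """
    image_dir = Path(image_dir).expanduser()
    written = []
    with open(image_dir / meta_name, encoding="utf-8") as meta:
        for line in meta:
            line = line.strip()
            if not line:
                continue  # tolerate blank lines in the metadata file
            entry = json.loads(line)
            # cat1.jpg -> cat1.txt, stored next to the image
            caption_path = (image_dir / entry[file_key]).with_suffix(".txt")
            caption_path.write_text(entry[text_key].strip() + "\n",
                                    encoding="utf-8")
            written.append(caption_path)
    return written
```

Calling `write_sidecar_captions("~/cat_images")` after a tagging run would place the caption files alongside the images, leaving meta.jsonl untouched.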
--- a/LLavaTagger/requirements.txt
+++ b/LLavaTagger/requirements.txt
@@ -5,7 +5,7 @@ ninja==1.11.1.1
 safetensors==0.4.2
 tokenizers==0.15.2
 transformers
-pytorch
+torch
 opencv-python
 numpy
 tqdm
--- a/PersonDatasetAssembler/README.md
+++ b/PersonDatasetAssembler/README.md
@@ -0,0 +1,20 @@
+# PersonDatasetAssembler
+
+PersonDatasetAssembler is a Python script that finds images of a specific person, specified by a reference image, in a directory of images or in a video file. PersonDatasetAssembler also supports raw images.
+
+## How to use
+
+First, create a Python venv and install the required packages into it:
+
+	$ python -m venv venv
+	$ source venv/bin/activate
+	$ pip install -r requirements.txt
+
+Then run PersonDatasetAssembler, for instance like so:
+
+	$ python PersonDatasetAssembler.py --referance someperson.jpg --match_model ../Weights/face_recognition_sface_2021dec.onnx --detect_model ../Weights/face_detection_yunet_2023mar.onnx --input ~/Photos --out imagesOfSomePerson
+
+Or to extract images from a video:
+
+	$ python PersonDatasetAssembler.py --referance someperson.jpg --match_model ../Weights/face_recognition_sface_2021dec.onnx --detect_model ../Weights/face_detection_yunet_2023mar.onnx -i ~/SomeVideo.mkv --out imagesOfSomePerson
+
--- a/PersonDatasetAssembler/requirements.txt
+++ b/PersonDatasetAssembler/requirements.txt
@@ -0,0 +1,4 @@
+numpy==1.26.4
+opencv-python==4.10.0.82
+tqdm==4.66.4
+Wand==0.6.13
--- a/README.md
+++ b/README.md
@@ -0,0 +1,35 @@
+# SDImagePreprocess
+
+This repo contains a collection of high-performance tools intended to ease the creation of datasets for training image generation models such as Stable Diffusion.
+
+## Included tools
+
+This repo contains the following tools:
+
+### SmartCrop
+
+SmartCrop is an application that uses content-aware cropping, [seam carving](https://en.wikipedia.org/wiki/Seam_carving) and resizing to bring a directory of images into the desired size and aspect ratio for training. SmartCrop is configurable to prioritize specific items or specific persons in the images provided.
+
+#### Content detected in image:
+
+![Content found in image](SmartCrop/images/IMGP3692.jpg)
+
+#### Cropped image based on content:
+![Cropped image](SmartCrop/images/IMGP3692C.jpg)
+
+### PersonDatasetAssembler
+
+PersonDatasetAssembler is a Python script that finds images of a specific person, specified by a reference image, in a directory of images or in a video file. PersonDatasetAssembler also supports raw images.
+
+### LLavaTagger
+
+LLavaTagger is a Python script that tags images based on a given prompt using the [LLaVA](https://llava-vl.github.io/) multimodal LLM. LLavaTagger can use any number of GPUs in parallel via DDP (distributed data parallel) for this task.
+
+### DanbooruTagger
+
+DanbooruTagger is a Python script of dubious utility that tags images using the [DeepDanbooru](https://github.com/KichangKim/DeepDanbooru) convolutional network.
+
+
+## License
+
+All files in this repo are licensed under the GPL v3, see LICENSE.
--- a/SmartCrop/README.md
+++ b/SmartCrop/README.md
@@ -0,0 +1,50 @@
+# SmartCrop
+
+SmartCrop is an application that uses content-aware cropping, [seam carving](https://en.wikipedia.org/wiki/Seam_carving) and resizing to bring a directory of images into the desired size and aspect ratio for training. SmartCrop is configurable to prioritize specific items or specific persons in the images provided.
+
+## Requirements
+
+* [cmake](https://cmake.org/) 3.6 or later
+* [opencv](https://opencv.org/) 4.8 or later
+* A C++17-capable compiler and standard library, such as GCC or LLVM/Clang
+* git is required to get the source
+
+## Building
+
+The steps to build this application are:
+
+	$ git clone https://uvos.xyz/git/uvos/SDImagePreprocess.git
+	$ cd SDImagePreprocess
+	$ mkdir build && cd build
+	$ cmake ..
+	$ make
+
+The binary can then be found in build/SmartCrop and can optionally be installed with:
+
+	$ sudo make install
+
+## Basic usage
+
+To process all images in the directory ~/images and write the results to the directory processedImages:
+
+	$ smartcrop --out processedImages ~/images/*
+
+To also focus on the person in the image ~/person.jpg:
+
+	$ smartcrop --out processedImages --focus-person ~/person.jpg ~/images/*
+
+To also enable seam carving:
+
+	$ smartcrop --out processedImages --focus-person ~/person.jpg --seam-carving ~/images/*
+
+See smartcrop --help for more options.
+
+## Example
+
+#### Content detected in image:
+![Content found in image](images/IMGP3692.jpg)
+
+#### Cropped image based on content:
+![Cropped image](images/IMGP3692C.jpg)
+
+
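The SmartCrop README links to seam carving without describing it, so the core step is sketched below in Python for readers who have not met the technique. This illustrates the general algorithm only and is not SmartCrop's C++ implementation. The function names and the plain nested-list energy map are assumptions made for the example.

```python
def find_vertical_seam(energy):
    """Return one column index per row forming a minimum-energy vertical seam.

    energy is a non-empty list of equal-length rows of numbers, for example
    gradient magnitudes; low values mark pixels that are cheap to remove.
    """
    rows, cols = len(energy), len(energy[0])
    # cost[y][x] is the cheapest total energy of any seam ending at (x, y).
    cost = [list(energy[0])]
    for y in range(1, rows):
        prev = cost[-1]
        row = []
        for x in range(cols):
            lo, hi = max(x - 1, 0), min(x + 1, cols - 1)
            row.append(energy[y][x] + min(prev[lo:hi + 1]))
        cost.append(row)
    # Backtrack from the cheapest bottom pixel, moving at most one column
    # left or right per row.
    x = min(range(cols), key=cost[-1].__getitem__)
    seam = [x]
    for y in range(rows - 1, 0, -1):
        lo, hi = max(x - 1, 0), min(x + 1, cols - 1)
        x = min(range(lo, hi + 1), key=cost[y - 1].__getitem__)
        seam.append(x)
    seam.reverse()
    return seam


def remove_vertical_seam(image, seam):
    """Return a copy of image (a list of rows) with one pixel per row removed."""
    return [row[:x] + row[x + 1:] for row, x in zip(image, seam)]
```

Repeating these two steps narrows an image by one column per iteration while avoiding high-energy regions, which is why the technique suits content-aware resizing.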
Author	SHA1	Message	Date
uvos	749e22335e	add readmes	2024-06-14 00:09:14 +02:00
uvos	8c4c56b5fe	PersonDatasetAssembler: add requirements.txt	2024-06-14 00:08:34 +02:00
uvos	42e32a6bb3	LLavaTagger: correct requirements	2024-06-13 23:52:52 +02:00