issue with (Meta_Synthetic_Data_Llama3_2_(3B).ipynb) #39

Open
ramixpe opened this issue May 5, 2025 · 0 comments

ramixpe commented May 5, 2025

When looping over more than 3 chunks, e.g.:

import time

# Process 3 chunks for now -> can increase but slower!
for filename in filenames[:5]:
    !synthetic-data-kit \
        -c synthetic_data_kit_config.yaml \
        create {filename} \
        --num-pairs 25 \
        --type "qa"
    time.sleep(2)  # Sleep some time to leave some room for processing

it looks like vLLM stops responding, or hits some timeout?!
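
A guard that waits for the server before launching each file might help narrow this down. Below is a minimal sketch of what I mean, assuming the vLLM OpenAI-compatible server from the notebook at http://localhost:8000/v1 (which exposes a /v1/models endpoint) and that the requests package is available in the environment; the wait_for_vllm helper is mine, not part of synthetic-data-kit:

import time
import requests

def wait_for_vllm(base_url="http://localhost:8000/v1", timeout=300, poll=5):
    # Poll the OpenAI-compatible /models endpoint until the server answers or we give up.
    deadline = time.time() + timeout
    while time.time() < deadline:
        try:
            if requests.get(f"{base_url}/models", timeout=5).status_code == 200:
                return True
        except requests.RequestException:
            pass  # server busy or restarting; keep polling
        time.sleep(poll)
    return False

for filename in filenames[:5]:
    if not wait_for_vllm():
        print("vLLM server did not respond in time; stopping early")
        break
    !synthetic-data-kit \
        -c synthetic_data_kit_config.yaml \
        create {filename} \
        --num-pairs 25 \
        --type "qa"
    time.sleep(2)  # Sleep some time to leave some room for processing

This only detects the outage, of course; it does not explain why the server stops responding after the first few files.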

Cell logs:
huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
To disable this warning, you can either:
- Avoid using tokenizers before the fork if possible
- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
Processing 7 chunks to generate QA pairs...
Batch processing complete.
Generated 26 QA pairs total
Saving result to data/generated/unoc_document_0_qa_pairs.json
Successfully wrote test file to data/generated/test_write.json
Successfully wrote result to data/generated/unoc_document_0_qa_pairs.json
Generating qa content from data/output/unoc_document_0.txt...
Content saved to data/generated/unoc_document_0_qa_pairs.json
huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
To disable this warning, you can either:
- Avoid using tokenizers before the fork if possible
- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
Processing 4 chunks to generate QA pairs...
Batch processing complete.
Generated 24 QA pairs total
Saving result to data/generated/unoc_document_1_qa_pairs.json
Successfully wrote test file to data/generated/test_write.json
Successfully wrote result to data/generated/unoc_document_1_qa_pairs.json
Generating qa content from data/output/unoc_document_1.txt...
Content saved to data/generated/unoc_document_1_qa_pairs.json
huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
To disable this warning, you can either:
- Avoid using tokenizers before the fork if possible
- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
Processing 5 chunks to generate QA pairs...
Batch processing complete.
Generated 0 QA pairs total
Saving result to data/generated/unoc_document_2_qa_pairs.json
Successfully wrote test file to data/generated/test_write.json
Successfully wrote result to data/generated/unoc_document_2_qa_pairs.json
Generating qa content from data/output/unoc_document_2.txt...
Content saved to data/generated/unoc_document_2_qa_pairs.json
huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
To disable this warning, you can either:
- Avoid using tokenizers before the fork if possible
- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
Error: VLLM server not available at http://localhost:8000/v1
Please start the VLLM server with:
vllm serve unsloth/Llama-3.2-3B-Instruct
huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
To disable this warning, you can either:
- Avoid using tokenizers before the fork if possible
- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
Error: VLLM server not available at http://localhost:8000/v1
Please start the VLLM server with:
vllm serve unsloth/Llama-3.2-3B-Instruct

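If it turns out the server process itself has died, a hypothetical recovery step would be to relaunch it with the exact command printed in the error above and wait for the endpoint to come back (reusing the wait_for_vllm helper from the sketch earlier; I'm not sure how the notebook originally starts the server, so treat this purely as an assumption):

import subprocess

# Relaunch vLLM with the command from the error message, then block until
# /v1/models responds again (or give up after 10 minutes).
server = subprocess.Popen(["vllm", "serve", "unsloth/Llama-3.2-3B-Instruct"])
if not wait_for_vllm(timeout=600):
    raise RuntimeError("vLLM server still not reachable after relaunch")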