\n",
" "
]
},
"metadata": {},
"execution_count": 81
}
],
"source": [
"df1.head()"
]
},
{
"cell_type": "code",
"execution_count": 82,
"metadata": {
"id": "bUCjN6UVeERH"
},
"outputs": [],
"source": [
"df1.to_json('qa_sample.jsonl', orient='records', lines=True)"
]
},
{
"cell_type": "code",
"execution_count": 83,
"metadata": {
"id": "5pNltIc1eKpP"
},
"outputs": [],
"source": [
"import os\n",
"from getpass import getpass\n",
"# Prompt for the key instead of hard-coding a secret in the notebook.\n",
"os.environ['OPENAI_API_KEY'] = getpass('OpenAI API key: ')"
]
},
{
"cell_type": "code",
"execution_count": 84,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "qMKRiq8jeVN6",
"outputId": "68747b92-83c8-47c2-a7f5-690c0475df48"
},
"outputs": [
{
"output_type": "stream",
"name": "stdout",
"text": [
"Analyzing...\n",
"\n",
"- Your file contains 20 prompt-completion pairs. In general, we recommend having at least a few hundred examples. We've found that performance tends to linearly increase for every doubling of the number of examples\n",
"- Your data does not contain a common separator at the end of your prompts. Having a separator string appended to the end of the prompt makes it clearer to the fine-tuned model where the completion should begin. See https://platform.openai.com/docs/guides/fine-tuning/preparing-your-dataset for more detail and examples. If you intend to do open-ended generation, then you should leave the prompts empty\n",
"- All prompts start with prefix `Question: `\n",
"- Your data does not contain a common ending at the end of your completions. Having a common ending string appended to the end of the completion makes it clearer to the fine-tuned model where the completion should end. See https://platform.openai.com/docs/guides/fine-tuning/preparing-your-dataset for more detail and examples.\n",
"- The completion should start with a whitespace character (` `). This tends to produce better results due to the tokenization we use. See https://platform.openai.com/docs/guides/fine-tuning/preparing-your-dataset for more details\n",
"\n",
"Based on the analysis we will perform the following actions:\n",
"- [Recommended] Add a suffix separator `\\n\\n###\\n\\n` to all prompts [Y/n]: Y\n",
"- [Recommended] Add a suffix ending `\\n` to all completions [Y/n]: Y\n",
"- [Recommended] Add a whitespace character to the beginning of the completion [Y/n]: Y\n",
"\n",
"\n",
"Your data will be written to a new JSONL file. Proceed [Y/n]: Y\n",
"\n",
"Wrote modified file to `qa_sample_prepared.jsonl`\n",
"Feel free to take a look!\n",
"\n",
"Now use that file when fine-tuning:\n",
"> openai api fine_tunes.create -t \"qa_sample_prepared.jsonl\"\n",
"\n",
"After you’ve fine-tuned a model, remember that your prompt has to end with the indicator string `\\n\\n###\\n\\n` for the model to start generating completions, rather than continuing with the prompt. Make sure to include `stop=[\"\\n\"]` so that the generated texts ends at the expected place.\n",
"Once your model starts training, it'll approximately take 2.72 minutes to train a `curie` model, and less for `ada` and `babbage`. Queue will approximately take half an hour per job ahead of you.\n"
]
}
],
"source": [
"!openai tools fine_tunes.prepare_data -f qa_sample.jsonl"
]
},
{
"cell_type": "code",
"execution_count": 85,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "rB6S552FeNmn",
"outputId": "df95cacd-e10e-4f12-fd7d-b93047e03ed8"
},
"outputs": [
{
"output_type": "stream",
"name": "stdout",
"text": [
"\rUpload progress: 0% 0.00/1.72k [00:00, ?it/s]\rUpload progress: 100% 1.72k/1.72k [00:00<00:00, 2.53Mit/s]\n",
"Uploaded file from qa_sample_prepared.jsonl: file-EJzP9CiMe5CeL9ub20pEVPDr\n",
"Created fine-tune: ft-rtDJX5c4gZGQsathmhgkAdMZ\n",
"Streaming events until fine-tuning is complete...\n",
"\n",
"(Ctrl-C will interrupt the stream, but not cancel the fine-tune)\n",
"[2023-03-23 18:29:02] Created fine-tune: ft-rtDJX5c4gZGQsathmhgkAdMZ\n",
"\n",
"Stream interrupted (client disconnected).\n",
"To resume the stream, run:\n",
"\n",
" openai api fine_tunes.follow -i ft-rtDJX5c4gZGQsathmhgkAdMZ\n",
"\n"
]
}
],
"source": [
"!openai api fine_tunes.create -t 'qa_sample_prepared.jsonl' -m 'davinci'"
]
},
{
"cell_type": "code",
"execution_count": 94,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "Sav-4eTkekpn",
"outputId": "4178cdcf-c0d6-436d-90e0-6caf82cb37cb"
},
"outputs": [
{
"output_type": "stream",
"name": "stdout",
"text": [
"[2023-03-23 18:29:02] Created fine-tune: ft-rtDJX5c4gZGQsathmhgkAdMZ\n",
"[2023-03-23 18:32:38] Fine-tune costs $0.05\n",
"[2023-03-23 18:32:38] Fine-tune enqueued. Queue number: 5\n",
"[2023-03-23 18:33:28] Fine-tune is in the queue. Queue number: 4\n",
"[2023-03-23 18:33:46] Fine-tune is in the queue. Queue number: 3\n",
"[2023-03-23 18:34:34] Fine-tune is in the queue. Queue number: 2\n",
"[2023-03-23 18:36:03] Fine-tune is in the queue. Queue number: 1\n",
"[2023-03-23 18:36:46] Fine-tune is in the queue. Queue number: 0\n",
"[2023-03-23 18:36:52] Fine-tune started\n",
"[2023-03-23 18:39:01] Completed epoch 1/4\n",
"[2023-03-23 18:39:07] Completed epoch 2/4\n",
"[2023-03-23 18:39:12] Completed epoch 3/4\n",
"[2023-03-23 18:39:18] Completed epoch 4/4\n",
"[2023-03-23 18:39:56] Uploaded model: davinci:ft-global-corporate-holdings-2023-03-23-18-39-55\n",
"[2023-03-23 18:39:57] Uploaded result file: file-my0JOmf92CH1KdwI9SG6czp6\n",
"[2023-03-23 18:39:57] Fine-tune succeeded\n",
"\n",
"Job complete! Status: succeeded 🎉\n",
"Try out your fine-tuned model:\n",
"\n",
"openai api completions.create -m davinci:ft-global-corporate-holdings-2023-03-23-18-39-55 -p \n"
]
}
],
"source": [
"!openai api fine_tunes.follow -i ft-rtDJX5c4gZGQsathmhgkAdMZ"
]
},
{
"cell_type": "code",
"execution_count": 97,
"metadata": {
"id": "pwdFktBKe_On",
"colab": {
"base_uri": "https://localhost:8080/"
},
"outputId": "3a817df0-479b-430b-995b-a184878aff87"
},
"outputs": [
{
"output_type": "stream",
"name": "stdout",
"text": [
"Answer the question based on DataSet Q: (White-Floral, Sweet) \n",
" Answer: Moscato d'Asti\n",
"\n",
"Q:\n"
]
}
],
"source": [
"!openai api completions.create -m davinci:ft-global-corporate-holdings-2023-03-23-18-39-55 -p \"Answer the question based on DataSet Q: (White-Floral, Sweet)\""
]
},
{
"cell_type": "code",
"source": [
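"import os\n",
"\n",
"import openai\n",
"\n",
"\n",
"def get_answers_api(question):\n",
"    \"\"\"Query the fine-tuned model; return the completion text, or an empty string on error.\"\"\"\n",
"    try:\n",
"        response = openai.Completion.create(\n",
"            # Read the key from the environment instead of hard-coding a secret.\n",
"            api_key=os.environ[\"OPENAI_API_KEY\"],\n",
"            engine=\"davinci:ft-global-corporate-holdings-2023-03-23-18-39-55\",\n",
"            # End the prompt with the separator that prepare_data appended to the training prompts.\n",
"            prompt=f\"Answer the question based on DataSet Q:\\n{question}\\n\\n###\\n\\n\",\n",
"            temperature=0,\n",
"            # A wine name is short; a large limit only leaves room for repeated text.\n",
"            max_tokens=64,\n",
"            top_p=1,\n",
"            frequency_penalty=0,\n",
"            presence_penalty=0,\n",
"            # Stop at the completion ending added by prepare_data so the answer is not repeated.\n",
"            stop=[\"\\n\"],\n",
"        )\n",
"        return response[\"choices\"][0][\"text\"].strip()\n",
"    except Exception as e:\n",
"        print(e)\n",
"        return \"\"\n",
"\n",
"\n",
"print(get_answers_api(\"White-Floral, Sweet\"))"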
],
"metadata": {
"id": "FOpiYqihxjUo",
"outputId": "87e7b8a2-c785-492c-b844-9535036aa9bc",
"colab": {
"base_uri": "https://localhost:8080/"
}
},
"execution_count": 99,
"outputs": [
{
"output_type": "stream",
"name": "stdout",
"text": [
"\n",
"\n",
"The correct answer is:\n",
"\n",
"\"Chardonnay\"\n"
]
}
]
},
{
"cell_type": "code",
"source": [
"import os\n",
"\n",
"import openai\n",
"\n",
"# Read the key from the environment instead of hard-coding a secret.\n",
"openai.api_key = os.environ[\"OPENAI_API_KEY\"]\n",
"\n",
"response = openai.ChatCompletion.create(\n",
"    model=\"gpt-3.5-turbo\",\n",
"    messages=[\n",
"        {\"role\": \"system\", \"content\": \"You are an assistant\"},\n",
"        {\"role\": \"user\", \"content\": \"Can you recommend wine based on my taste?\"},\n",
"        {\n",
"            \"role\": \"assistant\",\n",
"            \"content\": \"Of course! Please provide me with some details about your wine, such as Preference(Red,White,Red-Fruity,Red-Earthy,White-Crisp,Red-Spicy,Red-Rich), Red_Wine characteristics(Light-Bodied,Full-Bodied,Dry,Sweet,None), White_Wine Characteristics(Dry,Sweet,None) and the recommended wine such as (Pinot Noir,Gewurztraminer,Sauvignon Blanc,Shiraz or Zinfandel,Chianti,Chardonnay,Cabernet Sauvignon,Riesling).\",\n",
"        },\n",
"        {\"role\": \"user\", \"content\": \"(Red,Light-Bodied,None)\"},\n",
"    ],\n",
")\n",
"print(response)"
],
"metadata": {
"id": "nbEDdjmgyNLE",
"outputId": "c5b79e79-346d-4bde-b750-98e2cb58c813",
"colab": {
"base_uri": "https://localhost:8080/"
}
},
"execution_count": 101,
"outputs": [
{
"output_type": "stream",
"name": "stdout",
"text": [
"{\n",
" \"choices\": [\n",
" {\n",
" \"finish_reason\": \"stop\",\n",
" \"index\": 0,\n",
" \"message\": {\n",
" \"content\": \"Based on your preferences, I would suggest trying a Pinot Noir from Oregon or Burgundy. Both regions are known for producing Pinot Noir wines that are light-bodied with delicate fruit flavors, and have a lower tannin profile. Some specific recommendations include:\\n\\n- Erath Pinot Noir (Oregon)\\n- Domaine William F\\u00e8vre Chablis \\\"Le Domaine\\\" Pinot Noir (Burgundy, France)\\n- Joseph Drouhin LaFor\\u00eat Bourgogne Pinot Noir (Burgundy, France)\",\n",
" \"role\": \"assistant\"\n",
" }\n",
" }\n",
" ],\n",
" \"created\": 1679597713,\n",
" \"id\": \"chatcmpl-6xKNdM2Ye333AhWnzhF8hwOLTSjhT\",\n",
" \"model\": \"gpt-3.5-turbo-0301\",\n",
" \"object\": \"chat.completion\",\n",
" \"usage\": {\n",
" \"completion_tokens\": 108,\n",
" \"prompt_tokens\": 164,\n",
" \"total_tokens\": 272\n",
" }\n",
"}\n"
]
}
]
},
{
"cell_type": "code",
"source": [],
"metadata": {
"id": "F_H-PxFL0lzw"
},
"execution_count": null,
"outputs": []
}
],
"metadata": {
"colab": {
"provenance": []
},
"kernelspec": {
"display_name": "Python 3",
"name": "python3"
},
"language_info": {
"name": "python"
},
"accelerator": "GPU",
"gpuClass": "standard"
},
"nbformat": 4,
"nbformat_minor": 0
}