{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Library imports"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [],
"source": [
"from langchain_community.embeddings import HuggingFaceBgeEmbeddings\n",
"from langchain_text_splitters import RecursiveCharacterTextSplitter\n",
"from langchain_community.vectorstores import FAISS\n",
"from langchain_community.document_loaders import PyPDFLoader"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Loading the embeddings model"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"c:\\Users\\timmy_3aupohg\\anaconda3\\envs\\smog_env\\Lib\\site-packages\\sentence_transformers\\cross_encoder\\CrossEncoder.py:11: TqdmExperimentalWarning: Using `tqdm.autonotebook.tqdm` in notebook mode. Use `tqdm.tqdm` instead to force console mode (e.g. in jupyter console)\n",
" from tqdm.autonotebook import tqdm, trange\n"
]
}
],
"source": [
"# Initialize the embedding model\n",
"model_name = \"BAAI/bge-small-en\"\n",
"model_kwargs = {\"device\": \"cuda\"}  # use \"cpu\" if no GPU is available\n",
"encode_kwargs = {\"normalize_embeddings\": True}\n",
"embeddings = HuggingFaceBgeEmbeddings(\n",
"    model_name=model_name, model_kwargs=model_kwargs, encode_kwargs=encode_kwargs\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Experiment: PDF loading"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [],
"source": [
"# Check the document type and load a PDF as a list of documents\n",
"def load_pdf_document(document_path):\n",
"    if document_path.endswith(\".pdf\"):\n",
"        pdf_doc = PyPDFLoader(document_path)\n",
"        pages = pdf_doc.load_and_split()\n",
"        return pages\n",
"    else:\n",
"        raise ValueError(f\"Unsupported document type for {document_path}\")\n"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [],
"source": [
"# Load the document\n",
"document_path = \"data/corolla-2020-toyota-owners-manual.pdf\"\n",
"pdf_pages = load_pdf_document(document_path)"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [],
"source": [
"db = FAISS.from_documents(pdf_pages, embeddings)"
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {},
"outputs": [],
"source": [
"# Persist the FAISS index under a key-specific directory\n",
"def save_embedded_data(db, key=\"pdf\"):\n",
"    db.save_local(f\"vec-db/index/faiss_index_{key}\")\n",
"    print(\"Embeddings saved\")\n",
"\n",
"# Reload a saved index; only deserialize index files you created yourself\n",
"def load_embedded_data(embeddings, key=\"pdf\"):\n",
"    embed_db = FAISS.load_local(f\"vec-db/index/faiss_index_{key}\", embeddings, allow_dangerous_deserialization=True)\n",
"    return embed_db"
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Embeddings saved\n"
]
}
],
"source": [
"save_embedded_data(db)"
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {},
"outputs": [],
"source": [
"load_db = load_embedded_data(embeddings)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Data Search"
]
},
{
"cell_type": "code",
"execution_count": 24,
"metadata": {},
"outputs": [],
"source": [
"query = \"Steering assist function/lane centering function\"\n",
"docs = load_db.similarity_search(query)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"202 4-5. Using the driving support systems\n",
"COROLLA_UInside of displayed lines is \n",
"black\n",
"Indicates that the system is not able to recognize white (yellow) \n",
"lines or a course\n",
"* or is temporar-\n",
"ily canceled.\n",
"*: Boundary between asphalt and \n",
"the side of the road, such as \n",
"grass, soil, or a curb\n",
"Follow-up cruising display\n",
"Displayed when the multi-informa-tion display is switched to the driv-ing support system information screen.\n",
"Indicates that steering assist of the \n",
"lane centering function is operating by monitoring the position of a pre-ceding vehicle.\n",
"When the follow-up cruising display \n",
"is displayed, if the preceding vehi-cle moves, your vehicle may move in the same way. A lways pay care-\n",
"ful attention to your surroundings and operate the steering wheel as necessary to correct the path of the vehicle and ensure safety.\n",
"■Operation conditions of each \n",
"function\n",
"●Lane departure alert function\n",
"This function oper ates when all of \n",
"the following cond itions are met.\n",
"• LTA is turned on.• Vehicle speed is approximately 32 \n",
"mph (50 km/h) or more.*1\n",
"• System recognizes white (yellow) \n",
"lane lines or a course*2. (When a \n",
"white [yellow] line or course*2 is \n",
"recognized on only one side, the system will operate only for the \n",
"recognized side.)\n",
"• Width of traffic lane is approxi-\n",
"mately 9.8 ft. (3 m) or more.\n",
"• Turn signal lever is not operated.\n",
"(Vehicles with a Blind Spot Moni-\n",
"tor: Except when another vehicle \n",
"is in the lane on the side where the turn signal was operated)\n",
"• Vehicle is not being driven around \n",
"a sharp curve.\n",
"• No system malfunctions are \n",
"detected. ( P.204)\n",
"*1:The function oper ates even if the \n",
"vehicle speed is less than \n",
"approximately 32 mph (50 km/h) when the lane centering function is operating.\n",
"*2:Boundary between asphalt and \n",
"the side of the road, such as grass, soil, or a curb\n",
"●Steering assist function\n",
"This function operates when all of the following conditions are met in addition to the operation conditions for the lane departure alert function.\n",
"• Setting for “Steering Assist” in \n",
"of the multi-information display is \n",
"set to “ON”. ( P.548)\n",
"• Vehicle is not accelerated or \n",
"decelerated by a fixed amount or more.\n",
"• Steering wheel is not operated \n",
"with a steering force level suitable \n",
"for changing lanes.\n",
"• ABS, VSC, TRAC and PCS are \n",
"not operating.\n",
"• TRAC or VSC is not turned off.\n",
"• Hands off steering wheel warning \n",
"is not displayed. ( P.204)\n",
"●Vehicle sway warning function\n",
"This function operates when all of \n",
"https://www.MyCarManual.com\n"
]
}
],
"source": [
"print(docs[0].page_content)"
]
},
{
"cell_type": "code",
"execution_count": 26,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"201\n"
]
}
],
"source": [
"print(docs[0].metadata['page'])"
]
},
{
"cell_type": "code",
"execution_count": 15,
"metadata": {},
"outputs": [],
"source": [
"def search(db, query, k=4):\n",
"    # Return the top hit's text, the concatenated text of all hits, and their page numbers\n",
"    docs = db.similarity_search(query, k=k)\n",
"    combined = \"\"\n",
"    pages = []\n",
"    for doc in docs:\n",
"        combined += f\"{doc.page_content}\\n\"\n",
"        pages.append(doc.metadata['page'])\n",
"    return docs[0].page_content, combined, pages"
]
},
{
"cell_type": "code",
"execution_count": 16,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"206 4-5. Using the driving support systems\n",
"COROLLA_UWARNING\n",
"■Before using LDA system\n",
"●Do not rely solely upon the LDA \n",
"system. The LDA system does \n",
"not automatically drive the vehi-cle or reduce the amount of \n",
"attention that must be paid to \n",
"the area in front of the vehicle. The driver must always assume \n",
"full responsibilit y for driving \n",
"safely by paying careful atten-\n",
"tion to the surrounding condi-tions and operating the steering \n",
"wheel to correct the path of the \n",
"vehicle. Also, the driver must take adequate breaks when \n",
"fatigued, such as from driving \n",
"for a long period of time.\n",
"●Failure to perform appropriate \n",
"driving operations and pay care-\n",
"ful attention may lead to an \n",
"accident, resulting in death or serious injury.\n",
"●When not using the LDA sys-\n",
"tem, use the LDA switch to turn \n",
"the system off.\n",
"■Situations unsuitable for LDA system\n",
"In the following situations, use the LDA switch to turn the system off. \n",
"Failure to do so may lead to an \n",
"accident, resulting in death or serious injury.\n",
"●Vehicle is driven on a road sur-\n",
"face which is slippery due to \n",
"rainy weather, fallen snow, freezing, etc.\n",
"●Vehicle is driven on a snow-cov-\n",
"ered road.\n",
"●White (yellow) lin es are difficult \n",
"to see due to rain, snow, fog, \n",
"dust, etc.\n",
"●A spare tire, tire chains, etc. are \n",
"equipped.●When the tires have been excessively worn, or when the \n",
"tire inflation p ressure is low.\n",
"●When tires of a size other than specified are installed.\n",
"●Vehicle is driven in traffic lanes \n",
"other than that highways and \n",
"freeways.\n",
"●During emergency towing.\n",
"■Preventing LDA system mal-functions and operations per-\n",
"formed by mistake\n",
"●Do not modify the headlights or place stickers, etc. on the sur-\n",
"face of the lights.\n",
"●Do not modify the suspension etc. If the suspension etc. needs \n",
"to be replaced, contact your \n",
"Toyota dealer.\n",
"●Do not install or place anything on the hoo d or grille. Also, do \n",
"not install a gr ille guard (bull \n",
"bars, kangaroo bar, etc.).\n",
"●If your windshield needs repairs, contact your Toyota \n",
"dealer.\n",
"■Conditions in which functions \n",
"may not operate properly\n",
"In the following situations, the \n",
"functions may not operate prop-erly and the vehicle may depart \n",
"from its lane. Drive safely by \n",
"always paying careful attention to your surroundings and operate \n",
"the steering whee l to correct the \n",
"path of the vehicle without relying \n",
"solely on the functions.\n",
"●Vehicle is being driven around a sharp curve.\n",
"https://www.MyCarManual.com\n"
]
}
],
"source": [
"search_result, all_text, pages = search(db, \"What is LDA\")\n",
"print(search_result)"
]
},
{
"cell_type": "code",
"execution_count": 17,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"[205, 208, 204, 212]"
]
},
"execution_count": 17,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"pages"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "ai_index",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.9"
}
},
"nbformat": 4,
"nbformat_minor": 2
}