This project classifies and analyzes unstructured data. It can classify files of many different types (text, jpg, pdf, doc, png, py, c, c++, java, exe and many more) from a single folder.
Classification & Analysis of Unstructured Data
1. Classification and analysis of unstructured data
Project Mentor :-
Dr. Navanath Saharia
Radhe Raman Tiwari
Roll No :- 17010115
Aviral Prakash
Roll No :- 17010122
4. Schema:-
Examples:-
MP3 FILES
ID
File Name
File Creation Date/Time
File Access Date/Time
File Modify Date/Time
FileType
DOCX FILES
ID
File Name
File Creation Date/Time
File Access Date/Time
File Modify Date/Time
FileType
CPP FILES
ID
File Name
File Creation Date/Time
File Access Date/Time
File Modify Date/Time
FileType
Format:-
#FILETYPE FILES
ID
File Name
File Creation Date/Time
File Access Date/Time
File Modify Date/Time
FileType
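The "#FILETYPE FILES" naming in the schema can be illustrated with a minimal Python sketch. The function names collection_name and group_by_type are hypothetical and do not come from the project source, and the "UNKNOWN FILES" fallback for files without an extension is an assumption.

```python
from collections import defaultdict
from pathlib import PurePath


def collection_name(file_name):
    """Return a schema-style group name such as 'MP3 FILES' for a file name."""
    ext = PurePath(file_name).suffix.lstrip(".").upper()
    # Assumption: files without an extension go into an 'UNKNOWN FILES' group.
    return f"{ext or 'UNKNOWN'} FILES"


def group_by_type(file_names):
    """Group file names under their '<FILETYPE> FILES' heading."""
    groups = defaultdict(list)
    for name in file_names:
        groups[collection_name(name)].append(name)
    return dict(groups)
```

For example, group_by_type(["a.cpp", "b.docx", "c.cpp"]) puts both .cpp files under "CPP FILES" and the .docx file under "DOCX FILES".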
5. Data Dictionary :-
FOR WINDOWS TOOL:
Column        DataType    Description
_id           String      Object Id NOT NULL
File Name     String      Name of File NOT NULL
FileType      String      Type of File NOT NULL
Creation      Timestamp   File Creation Date/Time NOT NULL
Access        Timestamp   Last File Access Date/Time NOT NULL
Modify        Timestamp   Last File Modify Date/Time NOT NULL
FOR UBUNTU TOOL:
Column        DataType    Description
_id           String      Object Id NOT NULL
filename      String      Name of File NOT NULL
filetype      String      Type of File NOT NULL
fileage       Timestamp   File Creation Date/Time NOT NULL
lastaccess    Timestamp   Last File Access Date/Time NOT NULL
lastmodified  Timestamp   Last File Modify Date/Time NOT NULL
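As a rough illustration of the data dictionary, one record with the lowercase column names could be built from the file system as sketched below. The function name build_record and the use of a random UUID string for _id are assumptions, not the project's actual code. Note also that os.stat reports st_ctime as the creation time on Windows but as the last metadata change time on most Unix systems, so fileage is only an approximation there.

```python
import os
import uuid
from datetime import datetime


def build_record(path):
    """Build one metadata record (a dict) for the file at path."""
    st = os.stat(path)
    return {
        # Assumption: any unique string serves as the object id.
        "_id": uuid.uuid4().hex,
        "filename": os.path.basename(path),
        "filetype": os.path.splitext(path)[1].lstrip(".").lower(),
        # st_ctime is creation time on Windows, metadata-change time on Unix.
        "fileage": datetime.fromtimestamp(st.st_ctime),
        "lastaccess": datetime.fromtimestamp(st.st_atime),
        "lastmodified": datetime.fromtimestamp(st.st_mtime),
    }
```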
6. Description of the modules : Windows version
FileData() :- It takes the path of a directory and returns all files from the directory as a list.
MetaData() :- It takes the name of a file with its path and returns the metadata of the file as a list.
FileExtension() :- It takes a file name and returns its extension.
MongoData() :- It takes the metadata of a file and returns only the needed information as a list.
MongoConnect() :- It takes only the needed information of a file and stores it in MongoDB.
SetDataSet() :- It takes the needed information of a file and returns it in JSON format.
StringSplit() :- It takes information, splits it and returns it according to the requirement.
Main() :- It handles all modules based on the requirement.
Caller() :- It calls all modules according to the program logic.
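A few of the modules above can be sketched in Python as follows. This is only an illustration under assumptions: the snake_case names file_data, file_extension and set_data_set are hypothetical stand-ins for FileData(), FileExtension() and SetDataSet(), and the real tool may scan directories or format JSON differently.

```python
import json
import os


def file_data(directory):
    """Return the full paths of the regular files directly inside directory."""
    return sorted(
        os.path.join(directory, name)
        for name in os.listdir(directory)
        if os.path.isfile(os.path.join(directory, name))
    )


def file_extension(file_name):
    """Return the lowercase extension of file_name without the leading dot."""
    return os.path.splitext(file_name)[1].lstrip(".").lower()


def set_data_set(info):
    """Serialize the needed file information to a JSON string."""
    # default=str converts values JSON cannot encode (e.g. datetime) to text.
    return json.dumps(info, default=str)
```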
7. Description of the modules : Ubuntu version
abcd() :- It takes the path of a directory, extracts the metadata of all the files under it, stores it in a list and returns that list.
sort() :- It takes a list, sorts the metadata in it according to the filetype and returns the sorted list.
MongoConnect() :- It takes the metadata of a file in dictionary form and stores it in MongoDB.
SetDataSet() :- It takes the list holding the metadata, converts the metadata of each file in the list to a dictionary for later storage in MongoDB, and returns the original list.
__init__() :- It defines variables and lists globally.
setupUi() :- It deals with the design and functioning of the UI of the tool, including the buttons, search bar and output window.
retranslateUi() :- It deals with setting the window title, tool icon, tool name and button names.
GetterType() :- It invokes the abcd() function and gives the metadata of the files sorted by their filetype.
GetterCreate() :- It invokes the abcd() function and gives the metadata of the files sorted by their date of creation.
GetterModify() :- It invokes the abcd() function and gives the metadata of the files sorted by their last modified date.
GetterAccess() :- It invokes the abcd() function and gives the metadata of the files sorted by their last accessed date.
Caller() :- It calls all modules according to the program logic.
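The four getters differ only in the field they sort by, which the following minimal Python sketch tries to show. The names sort_by, getter_type, getter_create, getter_modify, getter_access and mongo_connect are hypothetical, and the sketch assumes each record is a dictionary with the keys filetype, fileage, lastmodified and lastaccess. The storage helper assumes it is handed a pymongo collection object, whose insert_one() method stores a single document.

```python
from operator import itemgetter


def sort_by(records, field):
    """Return a new list of metadata records sorted by the given field."""
    return sorted(records, key=itemgetter(field))


def getter_type(records):
    return sort_by(records, "filetype")


def getter_create(records):
    return sort_by(records, "fileage")


def getter_modify(records):
    return sort_by(records, "lastmodified")


def getter_access(records):
    return sort_by(records, "lastaccess")


def mongo_connect(collection, record):
    """Store one record; collection is assumed to be a pymongo collection."""
    # A copy is inserted so the caller's dictionary is left unchanged.
    collection.insert_one(dict(record))
```

Because sorted() is stable, records that share the same value for the chosen field keep their original relative order.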
8. Declaration:
The content (such as description, source code, and diagram)
presented and submitted to the instructor by Radhe Raman Tiwari,
roll no. 17010115, and Aviral Prakash, roll no. 17010122, is our own
creation (except system libraries/procedures). If anything is found
to be plagiarised, we, Radhe Raman Tiwari and Aviral Prakash, will
accept zero marks for the submitted project. We also allow the
academic section, IIIT Senapati, Manipur, to deduct ten marks from
our final score in the CS 240 or CS 241 course as punishment.