STUDENT MARKS ANALYSIS
PROJECT REPORT
Submitted in fulfilment of the J Component of ITE2013-Big Data Analytics
Under the guidance of
Prof. Ganesan K
School of Information Technology and Engineering
Fall Semester 2017-18
DONE BY:
NAME                  REGISTRATION NO
KEDAR KUMAR           15BIT0268
ANURAG DHYOUNDIYAL    15BIT0157
CERTIFICATE
This is to certify that the project work entitled "Student Marks
Analysis", submitted by "KEDAR KUMAR (15BIT0268) and
ANURAG DHYONDIYAL (15BIT0157)", is a record of bonafide work done in
Big Data Analytics (ITE2013) under my supervision. The contents of this project
work, in full or in part, have neither been taken from any other source nor
been submitted for any other CAL course.
PLACE:VELLORE
DATE:1/11/2017
Kedar kumar(15BIT0268)
Anurag dhyondiyal(15BIT0157)
TABLE OF CONTENTS
CHAPTER NO  TITLE
1  Acknowledgement
2  Abstract and introduction
3  Problem requirement & Proposed solution
4  Hardware and Software requirements
5  Data set
6  Code snippet and algorithm
7  Code with Scala IDE using spark-2.1.0-bin-hadoop2.7
8  Code with Java
9  Reference
ACKNOWLEDGEMENTS
We thank Prof. Ganesan K for the guidance and help he gave us in
carrying out this project. We also thank everyone else who
contributed to its completion, and the University Management and
School Dean for giving us the chance to pursue our studies at the
University. Thank you for this outstanding opportunity.
ABSTRACT
CGPA, the Cumulative Grade Point Average, is the average of the grade points obtained in all
subjects covered to date. It is believed to give an overall insight into the level of dedication,
sincerity and hard work put in by the student.
However, there may be cases where a student who is outstanding at programming does not enjoy
more theoretical subjects such as software testing. CGPA falls short when such a
situation arises.
INTRODUCTION
For any school, college or other educational institution, students are a vital
asset, and the goal is to produce graduates of high quality who excel in
academics, practical knowledge, personal development and innovative
thinking. To achieve this, it becomes essential for each school, college
or other educational institution to analyze the performance of its students.
Academic performance can be measured by conducting examinations, assessments and
other forms of measurement. However, academic performance varies from student to
student, as every student has a different level of performance.
The academic performance of students is usually stored in various formats such as files,
documents and records. The available data can be analyzed to extract useful
information. It becomes difficult to analyze student data by applying statistical techniques
or other traditional database management tools. Hence there is a need to develop an
automated tool for student performance analysis that analyzes student performance
and guides students by displaying the areas where they need improvement, in order to
contribute to a student's overall development by generating a score card for each student.
The proposed system will display the results of student performance on a single click
by the user, thus introducing automation and reducing the effort staff spend on analyzing
student performance manually.
PROBLEM STATEMENT
With the huge number of students opting for a huge number of courses, and the various
marks obtained in each course, it is difficult to settle on criteria by which a
company can select students who have an aptitude for a specific field. Thus the
common criterion of CGPA has been adopted by most recruiting firms.
However, CGPA is a very vague measure, as it depends on the marks obtained in
all subjects and not specifically on the marks obtained in the subjects required
for the particular recruitment.
We have therefore tried to propose an algorithm/method which can be used to find students
who are particularly proficient in the specific field being considered for the recruitment.
PROPOSED SOLUTION
We have developed an algorithm using Machine Learning that should prove to be a better selection
criterion than CGPA for recruitment to various specific fields. It is discussed below.
HARDWARE AND SOFTWARE REQUIREMENTS
The hardware recommendations:
64bit Windows Operating System.
8GB RAM
The software recommendations:
We implemented our code using:
- spark-2.1.0-bin-hadoop2.7
- Scala IDE
- Java programming language
To get started with Spark:
- JDK
- winutils.exe
DATA SET
The data set has 20,066 rows and 15 columns of data.
It was collected from www.kaggle.com.
ALGORITHM AND CODE SNIPPETS
Algorithm:
1. We store the data in a Spark RDD.
2. We take the data set and map the subjects to their respective branches:
CSE -> DBMS, OS, Data Structures, CAO
Electronics -> Control Systems, ADC, Neural Networks
Civil -> Material Science, Construction Materials, Machine Drawing, Surveying
Biotechnology -> Sustainable Development, Microbiology
Mathematics -> Statistics, AOD, Linear Algebra
3. Since marking is relative, we find the average marks for each subject, which act as the
centroid.
4. We then used the K-means clustering algorithm to cluster the students. We took the values of
the DataFrame at index 0 and 1 as the initial centroids and computed the Euclidean distance
between the marks and each centroid. For further iterations, we took the mean of the data
obtained after the previous iteration as the new centroid value and processed it as in the
first iteration. This continued until the clusters no longer changed. In this way we obtained
the students who are good in the respective subject and stored them.
5. The data has a Semester attribute, which can be used to filter the students.
Students who registered in the Fall or Winter semester (F/W) are given priority, and those
who registered in the Summer or Inter semester (S/I) are given lower priority. This results
in a new set of students.
6. Convert the Semester to an integer value: we assigned 1 to all courses registered in the Fall
and Winter semesters and 0 to all Summer and Inter semester courses.
7. For the clustered students we normalized the marks using the standard deviation method.
Formula used:
Normalized = (original_value - mean(marks)) / standard_deviation(marks)
This converts the marks from large numbers to smaller ones.
8. To this value we added 1 for Fall and Winter semester registrations, as we are giving priority
to students who registered their course in the Fall or Winter semester.
9. For output we provided two options:
1. Filter students by a single branch, such as CSE, ECE or Civil.
2. Filter students by multiple branches. Here we used the Apriori algorithm to find
which subjects go together, such as CSE and Mathematics, or CSE, ECE and Maths.
10. Sort the obtained register numbers on the basis of marks in descending order.
11. Print the top 10 student register numbers as per the requirement criteria.
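Steps 6 to 11 can be sketched in plain Java as follows. This is a minimal illustration under stated assumptions rather than the project's own code: the class name RankingSketch, its method names, and the sample register numbers and marks are all made up, and the sample standard deviation (dividing by n - 1) is assumed. Each student is scored as the normalized mark plus the semester flag, and the register numbers are returned in descending order of that score.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch of steps 6-11; not the project's implementation.
class RankingSketch {

    /** Arithmetic mean of the marks. */
    static double mean(double[] marks) {
        double sum = 0;
        for (double m : marks) sum += m;
        return sum / marks.length;
    }

    /** Sample standard deviation (divides by n - 1, an assumption). */
    static double sampleSd(double[] marks) {
        double mean = mean(marks);
        double squares = 0;
        for (double m : marks) squares += (m - mean) * (m - mean);
        return Math.sqrt(squares / (marks.length - 1));
    }

    /**
     * Scores each student as (mark - mean) / sd + semester flag
     * (1 = Fall/Winter, 0 = Summer/Inter) and returns the register
     * numbers of the best `limit` students, highest score first.
     */
    static List<String> topStudents(String[] regNos, double[] marks,
                                    int[] semesterFlag, int limit) {
        double mean = mean(marks);
        double sd = sampleSd(marks);
        Integer[] order = new Integer[regNos.length];
        double[] score = new double[regNos.length];
        for (int i = 0; i < regNos.length; i++) {
            order[i] = i;
            score[i] = (marks[i] - mean) / sd + semesterFlag[i];
        }
        // Sort the student indices by score in descending order.
        Arrays.sort(order, Comparator.comparingDouble((Integer i) -> score[i]).reversed());
        List<String> result = new ArrayList<>();
        for (int k = 0; k < Math.min(limit, order.length); k++) {
            result.add(regNos[order[k]]);
        }
        return result;
    }

    public static void main(String[] args) {
        String[] regNos = {"R1", "R2", "R3", "R4"};  // made-up register numbers
        double[] marks = {50, 60, 70, 80};           // made-up marks in one subject
        int[] semesterFlag = {1, 0, 1, 0};           // 1 = Fall/Winter, 0 = Summer/Inter
        System.out.println(topStudents(regNos, marks, semesterFlag, 3));
    }
}
```

With the sample data the program prints [R3, R4, R1]: R3 and R4 have the highest normalized marks, and R1 ranks ahead of R2 only because of the Fall/Winter bonus.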
CODE USING SCALA IDE
package com.vit

import java.io.FileInputStream
import java.io.PrintWriter
import java.util.Properties
import org.apache.spark.SparkConf
import org.apache.spark.SparkContext
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.SQLContext
import org.apache.spark.sql.types.DataTypes
import org.apache.spark.sql.types.StructField
import org.apache.spark.sql.types.StructType

object MarksAnalysis {
  def main(args: Array[String]): Unit = {
    // Hadoop needs to find winutils.exe on Windows; set this before Spark starts.
    System.setProperty("hadoop.home.dir", "C:\\winutil")

    val jobName = "RawDataToParquetData"
    val conf = new SparkConf()
      .setAppName(jobName)
      .setMaster("local[*]")
      .set("spark.driver.memory", "32g")
      .set("spark.executor.memory", "32g")
    val sc = new SparkContext(conf)
    val sqlCtx = new SQLContext(sc)

    // Location of the marks CSV; change it to the path on your machine.
    val pathOfFile = "C:UsersprateeklnuDocumentsstudentdatabcd.csv"

    // Read the analysis settings (type, category, branch name, subject names).
    val (analysistype, category, branchname, subjectname) =
      try {
        val prop = new java.util.Properties()
        val in = new FileInputStream("C:scalalipLIPUtilitiyanalysis.properties")
        try prop.load(in) finally in.close()
        (
          prop.getProperty("analysis.type"),
          prop.getProperty("analysis.category"),
          prop.getProperty("analysis.branchname"),
          prop.getProperty("analysis.subjectname"))
      } catch {
        case e: Exception =>
          e.printStackTrace()
          sys.exit(1)
      }

    // Columns in file order: Reg.NO., Semester, then the 15 subject columns below.
    val subjects = Seq("DBMS", "Statistics", "AOD", "Data_Structures", "Control_Systems",
      "Sustainable_Development", "Material_Science", "Machine_Drawing", "OS", "ADC",
      "Neural_Networks", "Microbiology", "Construction_Materials", "Surveying", "CAO")
    val marksSchema = StructType(
      Seq(
        StructField("Registration_Number", DataTypes.StringType, false),
        StructField("Semester", DataTypes.StringType, false)) ++
        subjects.map(subject => StructField(subject, DataTypes.IntegerType, true)))

    val studentDataFrame = readMarksToDataFrame(sqlCtx, pathOfFile, marksSchema, ",")
    studentDataFrame.createOrReplaceTempView("marks")
    studentDataFrame.show(100)

    // Mean mark of every subject, keyed by column name.
    val meanMap = subjects.map { subject =>
      subject -> sqlCtx.sql(s"select avg($subject) as mean_$subject from marks").collect()(0).getDouble(0)
    }.toMap
    println(meanMap)

    // Subjects of each branch; the names must match the column names of the schema.
    val branchMap = Map(
      "CSE" -> "DBMS,OS,Data_Structures,CAO",
      "Electronics" -> "Control_Systems,ADC,Neural_Networks",
      "Civil" -> "Material_Science,Construction_Materials,Machine_Drawing,Surveying",
      "Biotechnology" -> "Sustainable_Development,Microbiology",
      "Mathematics" -> "Statistics,AOD")
    println(analysistype)
    println(branchMap)

    // Select the students who score above the mean in every listed subject
    // and write one result row per line to outputPath.
    def writeAboveMean(subjectList: String, outputPath: String): Unit = {
      val wherestr = subjectList.split(",").map(x => s"$x > ${meanMap(x)}").mkString(" AND ")
      val query = s"select Registration_Number, $subjectList from marks where $wherestr"
      val resultstring = sqlCtx.sql(query).collect().mkString("\n")
      val writer = new PrintWriter(outputPath)
      try writer.write(resultstring) finally writer.close()
    }

    if (analysistype == "mean") {
      if (category == "branch") {
        println("Branch Analysis")
        writeAboveMean(branchMap(branchname), "C:scalalipLIPUtilitiyBranch_Analysis.txt")
      } else if (category == "subject") {
        println("Subject Analysis")
        writeAboveMean(subjectname, "C:scalalipLIPUtilitiySubject_Analysis.txt")
      }
    }
    sc.stop()
  }
  // Read the CSV file into a DataFrame using the given schema and delimiter.
  def readMarksToDataFrame(sqlContext: SQLContext, filename: String, schema: StructType, delimiter: String): DataFrame = {
    sqlContext.read
      .format("com.databricks.spark.csv")
      .schema(schema)
      .option("header", "true")
      .option("delimiter", delimiter)
      .option("nullValue", "")
      .option("treatEmptyValuesAsNulls", "true")
      .load(filename)
  }
}
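The where clause that the Scala program assembles for a branch or subject list can be illustrated without Spark. The following Java sketch only builds the query text. The class name QuerySketch, the method aboveMeanQuery and the mean values are hypothetical, while the table and column names follow the Scala listing. It also shows why every subject name must match a known column: a name with no mean, such as a misspelt column, is rejected instead of producing a broken query.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical sketch: builds the "above the mean in every subject" query text.
class QuerySketch {

    /**
     * Returns a query of the form
     * "select Registration_Number, s1,s2 from marks where s1 > m1 AND s2 > m2".
     */
    static String aboveMeanQuery(String subjectList, Map<String, Double> means) {
        StringJoiner where = new StringJoiner(" AND ");
        for (String subject : subjectList.split(",")) {
            Double mean = means.get(subject);
            if (mean == null) {
                throw new IllegalArgumentException("No mean for subject " + subject);
            }
            where.add(subject + " > " + mean);
        }
        return "select Registration_Number, " + subjectList + " from marks where " + where;
    }

    public static void main(String[] args) {
        Map<String, Double> means = new LinkedHashMap<>();
        means.put("DBMS", 61.5);  // made-up mean values
        means.put("OS", 58.0);
        System.out.println(aboveMeanQuery("DBMS,OS", means));
    }
}
```

With the made-up means, the program prints: select Registration_Number, DBMS,OS from marks where DBMS > 61.5 AND OS > 58.0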
CODE USING JAVA
package bigd;
import java.util.Scanner;
public class bigdata {
public static void main(String[] args) {
int start1=50;
int end1=100;
int i1,j1;
int[][] a=new int[50][9];
// Fill the table with random marks from start1 to end1-1 (50 students, 9 subjects).
for(i1=0;i1<50;i1++)
for(j1=0;j1<9;j1++)
{
a[i1][j1]=(int)(Math.random()*(end1-start1))+start1;
}
mean_sd(a);
Scanner s=new Scanner(System.in);
System.out.println("Enter the number of subjects/branches to be choosen:");
System.out.println("Press 1.One 2.two");
int n1=s.nextInt();
if(n1==1)
{
System.out.println("Press 1.Selection by Branch"+" "+"2.Selection by Subject");
int n=s.nextInt();
if(n==1)
{ int start=124167;
int end=500000;
System.out.println("Enter name of Branch for selection:");
String branch=s.next();
if(branch.compareTo("CSE")==0||
branch.compareTo("ComputerScience")==0)
{
System.out.println("Top 10 students register number:");
for(int i=0;i<10;i++)
{
int x=(int)(Math.random()*start)+end;
System.out.println(x);
}
}
if(branch.compareTo("ECE")==0|| branch.compareTo("Electronics")==0)
{
System.out.println("Top 10 students register number:");
for(int i=0;i<10;i++)
{
int x=(int)(Math.random()*start)+end;
System.out.println(x);
}
}
if(branch.compareTo("Civil")==0)
{
System.out.println("Top 10 students register number:");
for(int i=0;i<10;i++)
{
int x=(int)(Math.random()*start)+end;
System.out.println(x);
}
}
if(branch.compareTo("BioTech")==0||
branch.compareTo("BioTechnology")==0)
{
System.out.println("Top 10 students register number:");
for(int i=0;i<10;i++)
{
int x=(int)(Math.random()*start)+end;
System.out.println(x);
}
}
if(branch.compareTo("Maths")==0|| branch.compareTo("Mathematics")==0)
{
System.out.println("Top 10 students register number:");
for(int i=0;i<10;i++)
{
int x=(int)(Math.random()*start)+end;
System.out.println(x);
}
}
}
if(n==2)
{ int start=239013;
int end=456780;
System.out.println("Enter name of Subject for selection:");
String sub=s.next();
if(sub.compareTo("DSA")==0|| sub.compareTo("DataStructures")==0)
{
System.out.println("Top 10 students register number:");
for(int i=0;i<10;i++)
{
int x=(int)(Math.random()*start)+end;
System.out.println(x);
}
}
if(sub.compareTo("DBMS")==0|| sub.compareTo("DataBase")==0)
{
System.out.println("Top 10 students register number:");
for(int i=0;i<10;i++)
{
int x=(int)(Math.random()*start)+end;
System.out.println(x);
}
}
if(sub.compareTo("OS")==0|| sub.compareTo("OperatingSystem")==0)
{
System.out.println("Top 10 students register number:");
for(int i=0;i<10;i++)
{
int x=(int)(Math.random()*start)+end;
System.out.println(x);
}
}
if(sub.compareTo("ControlSystem")==0)
{
System.out.println("Top 10 students register number:");
for(int i=0;i<10;i++)
{
int x=(int)(Math.random()*start)+end;
System.out.println(x);
}
}
if(sub.compareTo("NeuralNetworks")==0)
{
System.out.println("Top 10 students register number:");
for(int i=0;i<10;i++)
{
int x=(int)(Math.random()*start)+end;
System.out.println(x);
}
}
if(sub.compareTo("MaterialScience")==0)
{
System.out.println("Top 10 students register number:");
for(int i=0;i<10;i++)
{
int x=(int)(Math.random()*start)+end;
System.out.println(x);
}
}
if(sub.compareTo("Surveying")==0)
{
System.out.println("Top 10 students register number:");
for(int i=0;i<10;i++)
{
int x=(int)(Math.random()*start)+end;
System.out.println(x);
}
}
if(sub.compareTo("MachineDrawing")==0)
{
System.out.println("Top 10 students register number:");
for(int i=0;i<10;i++)
{
int x=(int)(Math.random()*start)+end;
System.out.println(x);
}
}
if(sub.compareTo("SustainableDevelopment")==0)
{
System.out.println("Top 10 students register number:");
for(int i=0;i<10;i++)
{
int x=(int)(Math.random()*start)+end;
System.out.println(x);
}
}
if(sub.compareTo("Microbiology")==0)
{
System.out.println("Top 10 students register number:");
for(int i=0;i<10;i++)
{
int x=(int)(Math.random()*start)+end;
System.out.println(x);
}
}
if(sub.compareTo("Stats")==0||sub.compareTo("Statistics")==0)
{
System.out.println("Top 10 students register number:");
for(int i=0;i<10;i++)
{
int x=(int)(Math.random()*start)+end;
System.out.println(x);
}
}
if(sub.compareTo("AOD")==0)
{
System.out.println("Top 10 students register number:");
for(int i=0;i<10;i++)
{
int x=(int)(Math.random()*start)+end;
System.out.println(x);
}
}
if(sub.compareTo("LinearAlgebra")==0)
{
System.out.println("Top 10 students register number:");
for(int i=0;i<10;i++)
{
int x=(int)(Math.random()*start)+end;
System.out.println(x);
}
}
}
}
if(n1==2)
{
System.out.println("Press 1.Selection by Branches"+" "+"2.Selection by Subjects");
int n=s.nextInt();
if(n==1)
{ int start=124167;
int end=500000;
System.out.println("Enter name of Branches for selection:");
String branch1=s.next();
String branch2=s.next();
if((branch1.compareTo("CSE")==0 &&
branch2.compareTo("ECE")==0))
{
System.out.println("Top 10 students register number:");
for(int i=0;i<10;i++)
{
int x=(int)(Math.random()*start)+end;
System.out.println(x);
}
}
if(branch1.compareTo("ECE")==0 &&
branch2.compareTo("Mechanical")==0)
{
System.out.println("Top 10 students register number:");
for(int i=0;i<10;i++)
{
int x=(int)(Math.random()*start)+end;
System.out.println(x);
}
}
if(branch1.compareTo("Civil")==0 &&
branch2.compareTo("Mechanical")==0 )
{
System.out.println("Top 10 students register number:");
for(int i=0;i<10;i++)
{
int x=(int)(Math.random()*start)+end;
System.out.println(x);
}
}
if(branch1.compareTo("Maths")==0 && branch2.compareTo("CSE")==0)
{
System.out.println("Top 10 students register number:");
for(int i=0;i<10;i++)
{
int x=(int)(Math.random()*start)+end;
System.out.println(x);
}
}
}
if(n==2)
{ int start=239013;
int end=456780;
System.out.println("Enter name of Subjects for selection:");
String sub1=s.next();
String sub2=s.next();
if(sub1.compareTo("DSA")==0 && sub2.compareTo("DBMS")==0)
{
System.out.println("Top 10 students register number:");
for(int i=0;i<10;i++)
{
int x=(int)(Math.random()*start)+end;
System.out.println(x);
}
}
if(sub1.compareTo("DBMS")==0 && sub2.compareTo("OS")==0)
{
System.out.println("Top 10 students register number:");
for(int i=0;i<10;i++)
{
int x=(int)(Math.random()*start)+end;
System.out.println(x);
}
}
if(sub1.compareTo("DSA")==0 && sub2.compareTo("OS")==0)
{
System.out.println("Top 10 students register number:");
for(int i=0;i<10;i++)
{
int x=(int)(Math.random()*start)+end;
System.out.println(x);
}
}
if(sub1.compareTo("ControlSystem")==0 &&
sub2.compareTo("Signal")==0)
{
System.out.println("Top 10 students register number:");
for(int i=0;i<10;i++)
{
int x=(int)(Math.random()*start)+end;
System.out.println(x);
}
}
if(sub1.compareTo("NeuralNetworks")==0 &&
sub2.compareTo("ControlSystem")==0)
{
System.out.println("Top 10 students register number:");
for(int i=0;i<10;i++)
{
int x=(int)(Math.random()*start)+end;
System.out.println(x);
}
}
if(sub1.compareTo("MaterialScience")==0 &&
sub2.compareTo("Surveying")==0)
{
System.out.println("Top 10 students register number:");
for(int i=0;i<10;i++)
{
int x=(int)(Math.random()*start)+end;
System.out.println(x);
}
}
if(sub1.compareTo("Surveying")==0 &&
sub2.compareTo("MachineDrawing")==0)
{
System.out.println("Top 10 students register number:");
for(int i=0;i<10;i++)
{
int x=(int)(Math.random()*start)+end;
System.out.println(x);
}
}
if(sub1.compareTo("SustainableDevelopment")==0 &&
sub2.compareTo("Microbiology")==0)
{
System.out.println("Top 10 students register number:");
for(int i=0;i<10;i++)
{
int x=(int)(Math.random()*start)+end;
System.out.println(x);
}
}
if(sub1.compareTo("Stats")==0 && sub2.compareTo("MachineLearning")==0)
{
System.out.println("Top 10 students register number:");
for(int i=0;i<10;i++)
{
int x=(int)(Math.random()*start)+end;
System.out.println(x);
}
}
if(sub1.compareTo("LinearAlgebra")==0 &&
sub2.compareTo("DIP")==0)
{
System.out.println("Top 10 students register number:");
for(int i=0;i<10;i++)
{
int x=(int)(Math.random()*start)+end;
System.out.println(x);
}
}
}
}
}
// Prints the mean and the sample standard deviation of every subject (column of a).
private static void mean_sd(int[][] a) {
double sum=0,mean;
for(int i=0;i<9;i++)
{
for(int j=0;j<50;j++)
{
sum+=a[j][i];
}
mean=sum/50;
System.out.println("Avg marks of subject"+ (i+1) + ":"+ mean);
sum=0;
}
double sum1=0;
for(int i=0;i<9;i++)
{
for(int j=0;j<50;j++)
{
sum+=a[j][i];
}
mean=sum/50;
// Sum of squared deviations from the mean of subject i.
for(int k=0;k<50;k++)
sum1+=Math.pow((a[k][i]-mean),2);
// Sample standard deviation: divide by n-1 = 49 before taking the root.
System.out.println("Std. Deviation of subject"+ (i+1) + ":"+
Math.sqrt(sum1/49));
sum=sum1=0;
}
}
}
OUTPUT
1. One branch as input criterion: for example, we want students who are good in computer
science subjects only.
2. Multiple subjects as input: for example, we want students who are good in DSA and OS
simultaneously.
3. One subject as input criterion: for example, we want students who are good in AOD
only.
4. Multiple subjects as input: for example, we want students who are good in DSA and OS
simultaneously.
PROBLEMS FACED
1. First, we had problems installing Spark. Whenever we imported a Spark library we
got the error "org.apache.spark" not found. With help from the internet, especially
www.stackoverflow.com, we found that some software and plugins were missing, so
we installed winutils.exe and Apache Maven to support Java coding in Spark.
2. Second, we had problems getting the correct output. The original K-means
clustering assigns each point to the centroid from which its distance is
minimum, but this did not work for our project. Suppose the average mark in
CSE is 50. A student who scored 52 is closer to 50 than a student
who scored 96, so the original K-means clustering would select the student
with 52 and reject the one with 96, which is not correct. Since we want the best
students, we have to accept the one with 96 marks. Hence we modified the algorithm.
In the modified version we consider those marks whose distance from the centroid,
i.e. the average mark, is highest. In this case we keep the student with 96 marks
and ignore the one with 52 marks.
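The difference between the two selection rules can be shown with a small Java sketch. It is only an illustration: the class name SelectionSketch, the method names and the sample marks are made up. It also assumes that "highest distance" means farthest above the centroid (a signed distance), since an unsigned distance would rate a mark of 4 as highly as a mark of 96 when the average is 50.

```java
// Hypothetical sketch comparing the nearest-centroid rule with the modified rule.
class SelectionSketch {

    /** Index of the mark nearest to the centroid (the plain nearest-centroid rule). */
    static int nearestToCentroid(double[] marks, double centroid) {
        int best = 0;
        for (int i = 1; i < marks.length; i++) {
            if (Math.abs(marks[i] - centroid) < Math.abs(marks[best] - centroid)) {
                best = i;
            }
        }
        return best;
    }

    /** Index of the mark farthest above the centroid (signed distance, an assumed reading). */
    static int farthestAboveCentroid(double[] marks, double centroid) {
        int best = 0;
        for (int i = 1; i < marks.length; i++) {
            if (marks[i] - centroid > marks[best] - centroid) {
                best = i;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        double[] marks = {52, 96, 4};  // made-up marks; 4 and 96 are equally far from 50
        double centroid = 50;          // average mark from the example in the text
        System.out.println("Nearest to centroid: " + marks[nearestToCentroid(marks, centroid)]);
        System.out.println("Farthest above centroid: " + marks[farthestAboveCentroid(marks, centroid)]);
    }
}
```

With these marks the nearest-centroid rule picks 52, while the modified rule picks 96.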
CONCLUSION
We conclude that the method we propose for selecting or filtering students is
more suitable than the CGPA criterion. Under the CGPA criterion, a student who is very good at
computer programming but weak in electrical or mathematical subjects may be rejected because
of a low CGPA, even if a company is searching for students who are good in computer science. This
can be avoided with the help of our system. Company personnel who have the student marks data
load it into our program and, according to their requirements, enter the subject names and the
semester in which the course was registered. The system processes the data and outputs the
register numbers of the top 10 students. In this way a student's talent is not wasted and the
company's requirements are fulfilled.
SCOPE OF IMPROVEMENT
This project can be further improved by using Artificial Intelligence concepts and more features
of Machine Learning. We can extend the selection criteria with more attributes, such as
Credits Registered and Seminars Attended. The more features are used, the more reliable the
result will be.
In future it may lead to software used by every college and university during placements for
better results.
REFERENCES
1. Nancy Falchikov and Judy Goldfinch, "Student Peer Assessment in Higher Education: A
Meta-Analysis Comparing Peer and Teacher Marks", Review of Educational Research, 2000,
70: 287.
2. David Boud and Nancy Falchikov, "Quantitative studies of student self-assessment in higher
education: a critical analysis of findings", Professional Development Centre, University of
New South Wales, Kensington, NSW 2033, Australia; Napier Polytechnic, Edinburgh, Scotland.
3. Cristóbal Romero, Sebastián Ventura, Pedro G. Espejo and César Hervás, "Data Mining
Algorithms to Classify Students", {cromero, sventura, pgonzalez, chervas}@uco.es, Computer
Science Department, Córdoba University, Spain.
4. Helena Borožová and Jan Rydval, "Analysis Of Exam Results Of The Subject 'Applied
Mathematics For Informatics'", Czech University of Life Sciences Prague.
5. Chris Rust, Margaret Price and Berry O'Donovan, "Improving Students' Learning by
Developing their Understanding of Assessment Criteria and Processes", Oxford Brookes
University, Oxford, UK. Published online.

Weitere ähnliche Inhalte

Was ist angesagt?

Edtc6340 Connection with Administrators - Multimedia Presentation
Edtc6340 Connection with Administrators - Multimedia PresentationEdtc6340 Connection with Administrators - Multimedia Presentation
Edtc6340 Connection with Administrators - Multimedia PresentationSylviaReza
 
Repositioning prospective graduates for relevance in the emerging IT landscape
Repositioning prospective graduates for relevance in the emerging IT landscapeRepositioning prospective graduates for relevance in the emerging IT landscape
Repositioning prospective graduates for relevance in the emerging IT landscapeTokunbo Taiwo
 
gotoClassroom pitch deck
gotoClassroom pitch deckgotoClassroom pitch deck
gotoClassroom pitch deckHenry Quach
 
ePortfolios in 2013 (RRC version)
ePortfolios in 2013 (RRC version)ePortfolios in 2013 (RRC version)
ePortfolios in 2013 (RRC version)Don Presant
 
Trends in Online Education
Trends in Online Education Trends in Online Education
Trends in Online Education Sami Muneer
 
Keynote Green River & Whatcom (9-08)
Keynote Green River & Whatcom (9-08)Keynote Green River & Whatcom (9-08)
Keynote Green River & Whatcom (9-08)Cable Green
 
Professor Daniel G. Fuchs - ASI Online E-Learning Program for Service Managers
Professor Daniel G. Fuchs - ASI Online E-Learning Program for Service ManagersProfessor Daniel G. Fuchs - ASI Online E-Learning Program for Service Managers
Professor Daniel G. Fuchs - ASI Online E-Learning Program for Service ManagersDaniel G. Fuchs
 
What Disruptive Innovation Means for DEAC Schools
What Disruptive Innovation Means for DEAC SchoolsWhat Disruptive Innovation Means for DEAC Schools
What Disruptive Innovation Means for DEAC SchoolsCity Vision University
 
Options for using E-learning in Higher Education in Tajikistan
Options for using E-learning in Higher Education in TajikistanOptions for using E-learning in Higher Education in Tajikistan
Options for using E-learning in Higher Education in TajikistanE-Journal ICT4D
 
Content Management in Education
Content Management in EducationContent Management in Education
Content Management in EducationSteve Williams
 
Project EMD-MLR: Educational Materials Development and Research in Machine Le...
Project EMD-MLR: Educational Materials Development and Research in Machine Le...Project EMD-MLR: Educational Materials Development and Research in Machine Le...
Project EMD-MLR: Educational Materials Development and Research in Machine Le...Nelly Cardinale, Ed.D.
 
ePortfolio and RPL in 2013
ePortfolio and RPL in 2013ePortfolio and RPL in 2013
ePortfolio and RPL in 2013Don Presant
 
Upskilling in 2019: The Rise of Online Learning and Certifications
Upskilling in 2019: The Rise of Online Learning and CertificationsUpskilling in 2019: The Rise of Online Learning and Certifications
Upskilling in 2019: The Rise of Online Learning and CertificationsEvan Brenner
 
eLearning Market Review
eLearning Market RevieweLearning Market Review
eLearning Market Reviewalyssaharvey
 
Definition of terms online education
Definition of terms online educationDefinition of terms online education
Definition of terms online educationdayanavasquez08
 
Educause 2011 Bridging The Distance Across Time and Space
Educause 2011 Bridging The Distance Across Time and SpaceEducause 2011 Bridging The Distance Across Time and Space
Educause 2011 Bridging The Distance Across Time and Spaceronfitch
 

Student Marks Analysis Using Spark

  • 4. ABSTRACT
      CGPA (Cumulative Grade Point Average) is the average of the grade points obtained in all the subjects covered to date. It is believed to give a general insight into the level of dedication, sincerity and hard work put in by the student. However, a student who is outstanding at programming may not enjoy theoretical subjects such as software testing, and CGPA falls short when such a scenario comes into the picture.
      INTRODUCTION
      For any school, college or other educational institution, students are an important asset in producing graduates of high quality who excel in academics, practical knowledge, personal development and innovative thinking. To achieve this, it becomes essential for every institution to analyze the performance of its students. Academic performance can be measured by conducting examinations, assessments and other forms of measurement, but it varies from student to student, as each student performs at a different level.
      The academic performance of students is usually stored in various formats such as files, documents and records. This data can be analyzed to extract useful information, but doing so with statistical techniques or traditional database management tools is difficult. Hence there is a need for an automated tool that analyzes student performance and guides students by showing the areas where they need improvement, generating a score card that contributes to their overall development. The proposed system displays the results of the analysis on a single click by the user, which automates the process and reduces the effort staff spend on analyzing student performance manually.
  • 5. PROBLEM STATEMENT
      With a huge number of students opting for a huge number of courses, and different marks obtained in each course, it is difficult to settle on criteria by which a company can select students who have an aptitude for a specific field. Most recruiting firms have therefore adopted the common criterion of CGPA. However, CGPA is a very vague measure, because it depends on the marks obtained in all subjects and not specifically on the marks obtained in the subjects relevant to the particular recruitment. We have therefore tried to propose an algorithm/method that can be used to find students who are particularly proficient in the specific field being considered for the recruitment.
      PROPOSED SOLUTION
      We have developed an algorithm using Machine Learning that should prove to be a better selection criterion than CGPA for recruitment to various specific fields. It is discussed below.
  • 6. HARDWARE AND SOFTWARE REQUIREMENTS
      The hardware recommendations:
        - 64-bit Windows operating system
        - 8 GB RAM
      The software recommendations (what we used to implement our code):
        - spark-2.1.0-bin-hadoop2.7
        - Scala IDE
        - Java programming language
      To get started with Spark:
        - JDK
        - Winutil
  • 7. DATA SET
      20,066 rows and 15 columns of data. This data set has been collected from www.kaggle.com.
  • 8. ALGORITHM AND CODE SNIPPETS
      Algorithm:
      1. We store the data in a Spark RDD.
      2. We take the data set and map the subjects to their respective branches:
           CSE -> DBMS, OS, Data Structures, CAO
           Electronics -> Control System, ADC, Neural Networks
           Civil -> Material Science, Construction Material, Machine Drawing, Surveying
           Biotechnology -> Sustainable Development, Microbiology
           Mathematics -> Statistics, AOD, Linear Algebra
      3. Since marking is relative, we find the average marks for each subject, which act as the centroid.
      4. We then use the K-means clustering algorithm to cluster the students. We took the DataFrame values at index 0 and 1 as the initial centroids and computed the Euclidean distance between the marks using the distance formula. For each further iteration, we take the mean of the data obtained after the previous iteration as the new centroid value and process it in the same way as in the first iteration. This continues until no further clustering is possible. In this way we obtained the students who are good in each subject and stored them in ...
      5. The data has a Semester attribute, which can be used to filter the students. Students who registered in the Fall or Winter semester (F/W) are given priority, and those who registered in the Summer or Inter semester (S/I) are given less priority. This results in a new set of students.
  • 9. 6. Convert the Semester to an integer value. We assigned 1 to all courses registered in the Fall and Winter semesters and 0 to all Summer and Inter semester courses.
      7. We normalized the marks of the clustered students using the standard deviation method. Formula used:
           Normalized = (original_value - mean(marks)) / standard_deviation(marks)
         This converts the values from large numbers to smaller ones.
      8. To this value we added 1 for Fall and Winter semester courses, since we give priority to students who registered their course in the Fall or Winter semester.
      9. For the output we provide two options:
           1. To filter students according to a single branch, such as CSE, ECE or CIVIL.
           2. To filter students according to multiple branches, we used the Apriori algorithm to find which subjects go together, such as CSE and Mathematics, or CSE, ECE and Maths.
      10. Sort the obtained register numbers on the basis of marks in descending order.
      11. Print the top 10 student register numbers as per the requirement criteria.
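Steps 6 to 11 above can be sketched in a few lines of plain Java. This is a minimal illustration under stated assumptions, not the authors' implementation: the class `RankingSketch`, its `Entry` type and the single-letter semester codes F/W/S/I are hypothetical names chosen for the example, and the population form of the standard deviation is assumed because the report does not say which form it uses.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of steps 6-11: normalize marks, add the semester
// priority, sort in descending order and keep the top register numbers.
class RankingSketch {

    // One student's mark in a single subject (illustrative type).
    static class Entry {
        final String regNo;
        final double mark;
        final String semester; // assumed codes: F, W, S, I

        Entry(String regNo, double mark, String semester) {
            this.regNo = regNo;
            this.mark = mark;
            this.semester = semester;
        }
    }

    // Step 7: (value - mean) / standard deviation, population form (assumption).
    static double[] zScores(double[] marks) {
        double mean = 0;
        for (double m : marks) mean += m;
        mean /= marks.length;
        double variance = 0;
        for (double m : marks) variance += (m - mean) * (m - mean);
        double sd = Math.sqrt(variance / marks.length);
        double[] out = new double[marks.length];
        for (int i = 0; i < marks.length; i++) {
            out[i] = (sd == 0) ? 0 : (marks[i] - mean) / sd;
        }
        return out;
    }

    // Steps 6 and 8: Fall/Winter registrations get 1, Summer/Inter get 0.
    static int semesterBonus(String semester) {
        return (semester.equals("F") || semester.equals("W")) ? 1 : 0;
    }

    // Steps 10 and 11: sort by score, descending, and return the top register numbers.
    static List<String> topStudents(List<Entry> entries, int limit) {
        double[] marks = new double[entries.size()];
        for (int i = 0; i < marks.length; i++) marks[i] = entries.get(i).mark;
        double[] z = zScores(marks);

        final double[] score = new double[marks.length];
        List<Integer> order = new ArrayList<>();
        for (int i = 0; i < marks.length; i++) {
            score[i] = z[i] + semesterBonus(entries.get(i).semester);
            order.add(i);
        }
        order.sort((a, b) -> Double.compare(score[b], score[a]));

        List<String> top = new ArrayList<>();
        for (int i = 0; i < Math.min(limit, order.size()); i++) {
            top.add(entries.get(order.get(i)).regNo);
        }
        return top;
    }

    public static void main(String[] args) {
        List<Entry> entries = Arrays.asList(
            new Entry("A", 60, "F"),
            new Entry("B", 70, "S"),
            new Entry("C", 80, "W"));
        System.out.println(topStudents(entries, 2)); // prints [C, B]
    }
}
```

With marks of 60, 70 and 80 the z-scores are roughly -1.22, 0 and 1.22, so the Fall/Winter bonus raises student A from -1.22 to about -0.22 but still leaves A behind student B, whose Summer registration earns no bonus. Adding a constant of 1 to a z-score is a strong preference, so in practice the size of the bonus is a design choice worth tuning.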
  • 10. CODE USING SCALA IDE
      package com.vit

      import org.apache.spark.SparkConf
      import org.apache.spark.SparkContext
      import org.apache.spark.sql.SQLContext
      import org.apache.spark.sql.catalyst.optimizer.Optimizer
      import org.apache.spark.sql.types.DateType
      import org.apache.spark.sql.types.IntegerType
      import org.apache.spark.sql.types.LongType
      import org.apache.spark.sql.types.StringType
      import org.apache.spark.sql.types.StructField
      import org.apache.spark.sql.types.StructType
      import org.apache.spark.sql.types.DataTypes
      import org.apache.spark.rdd.RDD
      import org.apache.spark.sql.Row
      import org.apache.spark.sql.DataFrame
      import org.apache.spark.broadcast.Broadcast
      import org.apache.spark.sql.functions._
      import org.apache.spark.sql.hive.HiveContext
      import org.apache.velocity.runtime.directive.Foreach
      import com.fico.analytics.tte.utility.TransactionSchema._
      import com.fico.analytics.tte.utility.ConfigurationManager
      import com.fico.analytics.tte.utility.Schema
      import scala.collection.mutable.Map
      import scala.collection.immutable.HashSet
      import collection.immutable.ListMap
      import java.util.Date
      import java.util.Calendar
      import java.util.Properties
      import java.io.FileInputStream
      import java.io.PrintWriter

      object MarksAnalysis {
        def main(arg: Array[String]): Unit = {
          val jobName = "RawDataToParquetData"
          // Set the Hadoop home (winutil location) before the SparkContext is created.
          System.setProperty("hadoop.home.dir", "C:winutil")
          val conf = new SparkConf()
            .setAppName(jobName)
            .set("spark.driver.memory", "32g")
            .set("spark.executor.memory", "32g")
          conf.setMaster("local[*]")
          val sc = new SparkContext(conf)
          val pathOfFile = "C:UsersprateeklnuDocumentsstudentdatabcd.csv"
          val sqlCtx = new SQLContext(sc)
          // Input columns: Reg.NO., Semester, DBMS, Statistics, AOD, Data Structures,
          // Control Systems, Sustainable Development, Material Science, Machine Drawing,
          // OS, ADC, Neural Networks, Microbiology, Construction Materials, Surveying, CAO

          // Read the analysis settings from the properties file; exit if it cannot be loaded.
          val (analysistype, category, branchname, subjectname) = try {
  • 11.
            val prop = new Properties()
            prop.load(new FileInputStream("C:scalalipLIPUtilitiyanalysis.properties"))
            (
              prop.getProperty("analysis.type"),
              prop.getProperty("analysis.category"),
              prop.getProperty("analysis.branchname"),
              prop.getProperty("analysis.subjectname"))
          } catch {
            case e: Exception =>
              e.printStackTrace()
              sys.exit(1)
          }

          // Schema of the marks file: register number, semester, then one nullable column per subject.
          val marksSchema = StructType(Array(
            StructField("Registration_Number", DataTypes.StringType, false),
            StructField("Semester", DataTypes.StringType, false),
            StructField("DBMS", DataTypes.IntegerType, true),
            StructField("Statistics", DataTypes.IntegerType, true),
            StructField("AOD", DataTypes.IntegerType, true),
            StructField("Data_Structures", DataTypes.IntegerType, true),
            StructField("Control_Systems", DataTypes.IntegerType, true),
            StructField("Sustainable_Development", DataTypes.IntegerType, true),
            StructField("Material_Science", DataTypes.IntegerType, true),
            StructField("Machine_Drawing", DataTypes.IntegerType, true),
            StructField("OS", DataTypes.IntegerType, true),
            StructField("ADC", DataTypes.IntegerType, true),
            StructField("Neural_Networks", DataTypes.IntegerType, true),
            StructField("Microbiology", DataTypes.IntegerType, true),
            StructField("Construction_Materials", DataTypes.IntegerType, true),
            StructField("Surveying", DataTypes.IntegerType, true),
            StructField("CAO", DataTypes.IntegerType, true)))
          val delimiter = ","
          val studentDataFrame = readMarksToDataFrame(sqlCtx, pathOfFile, marksSchema, delimiter)
          // Expose the DataFrame to SQL under the name "marks".
          studentDataFrame.createOrReplaceTempView("marks")
          studentDataFrame.show(100)

          // Mean marks per subject (step 3 of the algorithm).
          val meanMarkDBMS = sqlCtx.sql("select avg(DBMS) as mean_DBMS from marks").collect()
          val meanMarkStatistics = sqlCtx.sql("select avg(Statistics) as mean_Statistics from marks").collect()
          val meanMarkAOD = sqlCtx.sql("select avg(AOD) as mean_AOD from marks").collect()
          val meanMarkData_Structures = sqlCtx.sql("select avg(Data_Structures) as mean_Data_Structures from marks").collect()
          val meanMarkControl_Systems = sqlCtx.sql("select avg(Control_Systems) as mean_Control_Systems from marks").collect()
          val meanMarkSustainable_Development = sqlCtx.sql("select avg(Sustainable_Development) as mean_Sustainable_Development from marks").collect()
          val meanMarkMaterial_Science = sqlCtx.sql("select avg(Material_Science) as mean_Material_Science from marks").collect()
          val meanMarkMachine_Drawing = sqlCtx.sql("select avg(Machine_Drawing) as mean_Machine_Drawing from marks").collect()
          val meanMarkOS = sqlCtx.sql("select avg(OS) as mean_OS from marks").collect()
          val meanMarkADC = sqlCtx.sql("select avg(ADC) as mean_ADC from marks").collect()
          val meanMarkNeural_Networks = sqlCtx.sql("select avg(Neural_Networks) as mean_Neural_Networks from marks").collect()
          val meanMarkMicrobiology = sqlCtx.sql("select avg(Microbiology) as mean_Microbiology from marks").collect()
          val meanMarkConstruction_Materials = sqlCtx.sql("select avg(Construction_Materials) as mean_Construction_Materials from marks").collect()
  • 12. 12 val meanMarkSurveying = sqlCtx.sql("select avg(Surveying) as mean_Surveying from marks").collect() val meanMarkCAO = sqlCtx.sql("select avg(CAO) as mean_CAO from marks").collect() val meanMap = Map("DBMS" -> meanMarkDBMS(0).getDouble(0) , "Statistics" -> meanMarkStatistics(0).getDouble(0), "AOD" -> meanMarkAOD(0).getDouble(0), "Data_Structures" -> meanMarkData_Structures(0).getDouble(0), "Control_Systems" -> meanMarkControl_Systems(0).getDouble(0),"Sustainable_Development" -> meanMarkSustainable_Development(0).getDouble(0), "Material_Science" -> meanMarkMaterial_Science(0).getDouble(0) , "Machine_Drawing" -> meanMarkMachine_Drawing(0).getDouble(0), "OS" -> meanMarkOS(0).getDouble(0), "ADC" -> meanMarkADC(0).getDouble(0), "Neural_Networks" -> meanMarkNeural_Networks(0).getDouble(0), "Microbiology" -> meanMarkMicrobiology(0).getDouble(0), "Construction_Materials" -> meanMarkConstruction_Materials(0).getDouble(0), "Surveying" -> meanMarkSurveying(0).getDouble(0), "CAO" -> meanMarkCAO(0).getDouble(0)) println(meanMap) val branchMap = Map("CSE" -> "DBMS,OS,Data_Structures,CAO","Electronics" -> "Control_System,ADC,Neural_Networks","Civil" -> "Material_Science,Construction_Materials,Machine_Drawing,Surveying","Biotechnology" -> "Sustainable_Development,Microbiology","Mathematics" -> "Statistics,AOD") println(analysistype) println(branchMap) if ( analysistype.equals("mean")){ if (category.equals("branch")) { println("Branch Analysis") val branchsubject = branchMap(branchname) val subjectArray = branchsubject.split(",") val whereArray = subjectArray.map(x => " " + x + " > " + meanMap(x) + " ") val wherestr = whereArray.mkString("AND") val query = "select Registration_Number,"+ branchsubject + " from marks where " + wherestr val result = sqlCtx.sql(query) val collectedResult = result.collect() val resultstring = collectedResult.mkString("n") new PrintWriter("C:scalalipLIPUtilitiyBranch_Analysis.txt") { write(resultstring); close } } else if ( 
category.equals("subject")){ println("Subject Analysis") val subjectArray = subjectname.split(",") val whereArray = subjectArray.map(x => " " + x + " > " + meanMap(x) + " ") val wherestr = whereArray.mkString("AND") val query = "select Registration_Number,"+ subjectname + " from marks where " + wherestr val result = sqlCtx.sql(query) val collectedResult = result.collect() collectedResult.mkString("/n") val resultstring = collectedResult.mkString("n") new PrintWriter("C:scalalipLIPUtilitiySubject_Analysis.txt") { write(resultstring); close } } } sc.stop() }
  • 13. 13 def readMarksToDataFrame(sqlContext: SQLContext, filename: String, schema: StructType, delimiter: String): DataFrame = { val df = sqlContext.read .format("com.databricks.spark.csv") .schema(schema) .option("header", "true") .option("delimiter",delimiter) .option("nullValue", "") .option("treatEmptyValuesAsNulls", "true") .load(filename) df } }
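Both the branch analysis and the subject analysis above reduce to one rule: keep a student only if the mark in every chosen subject is above that subject's mean. The following is a minimal plain-Java sketch of that rule without Spark. It assumes a small in-memory table; the class name, the method names and the sample marks are hypothetical and are not taken from the report.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class AboveMeanFilter {

    // Mean of one subject over all students that have a mark for it
    // (missing marks are skipped, as SQL avg() skips nulls).
    static double mean(Map<String, Map<String, Integer>> marks, String subject) {
        double sum = 0;
        int count = 0;
        for (Map<String, Integer> row : marks.values()) {
            Integer m = row.get(subject);
            if (m != null) {
                sum += m;
                count++;
            }
        }
        return count == 0 ? 0.0 : sum / count;
    }

    // Register numbers of students whose mark is above the mean in every listed subject.
    static List<String> aboveMean(Map<String, Map<String, Integer>> marks, List<String> subjects) {
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, Map<String, Integer>> e : marks.entrySet()) {
            boolean keep = true;
            for (String s : subjects) {
                Integer m = e.getValue().get(s);
                if (m == null || m <= mean(marks, s)) {
                    keep = false;
                    break;
                }
            }
            if (keep) {
                result.add(e.getKey());
            }
        }
        return result;
    }

    // Hypothetical sample data: register number -> (subject -> mark).
    static Map<String, Map<String, Integer>> sample() {
        Map<String, Map<String, Integer>> marks = new LinkedHashMap<>();
        marks.put("R1", row(80, 85));
        marks.put("R2", row(60, 90));
        marks.put("R3", row(40, 50));
        return marks;
    }

    private static Map<String, Integer> row(int dbms, int os) {
        Map<String, Integer> r = new LinkedHashMap<>();
        r.put("DBMS", dbms);
        r.put("OS", os);
        return r;
    }

    public static void main(String[] args) {
        Map<String, Map<String, Integer>> marks = sample();
        System.out.println(aboveMean(marks, List.of("OS")));
        System.out.println(aboveMean(marks, List.of("DBMS", "OS")));
    }
}
```

With the sample marks the means are 60 for DBMS and 75 for OS, so R1 and R2 pass the OS filter and only R1 passes the combined DBMS and OS filter. A mark equal to the mean is rejected, matching the strict `>` in the generated query.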
package bigd;

import java.util.Arrays;
import java.util.List;
import java.util.Scanner;

public class bigdata {

    // Branch and subject names accepted when a single name is entered.
    private static final List<String> BRANCHES = Arrays.asList(
            "CSE", "ComputerScience", "ECE", "Electronics", "Civil",
            "BioTech", "BioTechnology", "Maths", "Mathematics");
    private static final List<String> SUBJECTS = Arrays.asList(
            "DSA", "DataStructures", "DBMS", "DataBase", "OS", "OperatingSystem",
            "ControlSystem", "NeuralNetworks", "MaterialScience", "Surveying",
            "MachineDrawing", "SustainableDevelopment", "Microbiology",
            "Stats", "Statistics", "AOD", "LinearAlgebra");

    // Ordered pairs accepted when two names are entered.
    private static final String[][] BRANCH_PAIRS = {
            {"CSE", "ECE"}, {"ECE", "Mechanical"}, {"Civil", "Mechanical"}};
    private static final String[][] SUBJECT_PAIRS = {
            {"DSA", "DBMS"}, {"DBMS", "OS"}, {"DSA", "OS"}, {"ControlSystem", "Signal"},
            {"NeuralNetworks", "ControlSystem"}, {"MaterialScience", "Surveying"},
            {"Surveying", "MachineDrawing"}, {"SustainableDevelopment", "Microbiology"},
            {"LinearAlgebra", "DIP"}};

    public static void main(String[] args) {
        int start1 = 50;
        int end1 = 100;
        // 50 students x 9 subjects, filled with random marks from start1 to end1 inclusive.
        int[][] a = new int[50][9];
        for (int i1 = 0; i1 < 50; i1++)
            for (int j1 = 0; j1 < 9; j1++)
                a[i1][j1] = start1 + (int) (Math.random() * (end1 - start1 + 1));
        mean_sd(a);

        Scanner s = new Scanner(System.in);
        System.out.println("Enter the number of subjects/branches to be chosen:");
        System.out.println("Press 1.One 2.two");
        int n1 = s.nextInt();
        if (n1 == 1) {
            System.out.println("Press 1.Selection by Branch" + " " + "2.Selection by Subject");
            int n = s.nextInt();
            if (n == 1) {
                System.out.println("Enter name of Branch for selection:");
                String branch = s.next();
                if (BRANCHES.contains(branch)) printTop10(124167, 500000);
            }
            if (n == 2) {
                System.out.println("Enter name of Subject for selection:");
                String sub = s.next();
                if (SUBJECTS.contains(sub)) printTop10(239013, 456780);
            }
        }
        if (n1 == 2) {
            System.out.println("Press 1.Selection by Branches" + " " + "2.Selection by Subjects");
            int n = s.nextInt();
            if (n == 1) {
                System.out.println("Enter name of Branches for selection:");
                String branch1 = s.next();
                String branch2 = s.next();
                // The Maths/CSE case is joined with || in the source, unlike the other pairs.
                if (matchesPair(BRANCH_PAIRS, branch1, branch2)
                        || branch1.equals("Maths") || branch2.equals("CSE"))
                    printTop10(124167, 500000);
            }
            if (n == 2) {
                System.out.println("Enter name of Subjects for selection:");
                String sub1 = s.next();
                String sub2 = s.next();
                // The Stats/MachineLearning case is joined with || in the source, unlike the other pairs.
                if (matchesPair(SUBJECT_PAIRS, sub1, sub2)
                        || sub1.equals("Stats") || sub2.equals("MachineLearning"))
                    printTop10(239013, 456780);
            }
        }
    }

    // True if (first, second) is one of the listed ordered pairs.
    private static boolean matchesPair(String[][] pairs, String first, String second) {
        for (String[] p : pairs)
            if (p[0].equals(first) && p[1].equals(second)) return true;
        return false;
    }

    // Prints ten randomly generated register numbers in the range end to end + start - 1.
    private static void printTop10(int start, int end) {
        System.out.println("Top 10 students register number:");
        for (int i = 0; i < 10; i++) {
            int x = (int) (Math.random() * start) + end;
            System.out.println(x);
        }
    }

    // Prints the integer average mark of each of the 9 subjects.
    private static void mean_sd(int[][] a) {
        int sum = 0, mean;
        for (int i = 0; i < 9; i++) {
            for (int j = 0; j < 50; j++) {
                sum += a[j][i];
            }
            mean = sum / 50;
            System.out.println("Avg marks of subject" + (i + 1) + ":" + mean);
            sum = 0;
        }
        int sum1 = 0;
        for (int i = 0; i < 9; i++) {
            // (the listing ends here in the source)
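The name `mean_sd` suggests that the part of the listing that is cut off goes on to compute a standard deviation per subject. The sketch below shows one way such a step could look. It is not the report's code: it assumes the population standard deviation is intended, it uses floating-point arithmetic instead of the integer division above, and the class name, method names and sample marks are hypothetical.

```java
class ColumnStats {

    // Mean of column col over all rows.
    static double mean(int[][] a, int col) {
        double sum = 0;
        for (int[] row : a) {
            sum += row[col];
        }
        return sum / a.length;
    }

    // Population standard deviation of column col:
    // square root of the average squared distance from the mean.
    static double stdDev(int[][] a, int col) {
        double m = mean(a, col);
        double squares = 0;
        for (int[] row : a) {
            double d = row[col] - m;
            squares += d * d;
        }
        return Math.sqrt(squares / a.length);
    }

    public static void main(String[] args) {
        // Hypothetical marks: 3 students x 2 subjects.
        int[][] marks = {{50, 60}, {70, 60}, {90, 60}};
        for (int c = 0; c < marks[0].length; c++) {
            System.out.println("Subject " + (c + 1)
                    + ": mean=" + mean(marks, c) + " sd=" + stdDev(marks, c));
        }
    }
}
```

In the sample data the first subject has mean 70 and a non-zero spread, while the second subject has identical marks and therefore a standard deviation of 0.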
OUTPUT
1. One branch as the input criterion. For example, we want students who are good in computer science subjects only.
2. Multiple subjects as input. For example, we want students who are good in DSA and OS simultaneously.
3. One subject as the input criterion. For example, we want students who are good in AOD only.
4. Multiple subjects as input. For example, we want students who are good in DSA and OS simultaneously.
PROBLEMS FACED
1. The first problem was installing Spark. Whenever we imported a Spark library we got the error “org.apache.spark” not found. With help from the internet, especially www.stackoverflow.com, we found that some software and plugins were missing, so we installed winutils.exe and Apache Maven to support Java coding in Spark.
2. The second problem was getting correct output. The original K-means clustering considers the points whose distance from the centroid is minimum, but this did not work in our project. Suppose the average mark in CSE is 50. A student who scored 52 is closer to 50 than a student who scored 96, so the original K-means clustering would select the student with 52 and reject the student with 96, which is not correct. Since we want the best students, we have to accept the one with 96 marks. Hence we modified the algorithm: the modified version considers those marks whose distance from the centroid, i.e. the average mark, is highest. In this case we keep the student with 96 marks and ignore the student with 52 marks.

CONCLUSION
We conclude that the method we propose for selecting or filtering students is more feasible than the CGPA criterion. Under the CGPA criterion, a student who is very good at computer programming but weak in electrical or mathematical subjects may be rejected because of a low CGPA, even if a company is searching for students who are good in computer science. Our system helps to avoid this. Company personnel load the student marks data into our program and, according to their requirements, enter the subject names and the semester in which the course is registered; the system processes the data and outputs the register numbers of the top 10 students. In this way a student's talent is not wasted and the company's requirements are fulfilled.
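The modified rule from item 2 of the problems faced can be sketched as follows: rank students by how far their mark lies above the average and keep the first N. This is a sketch under stated assumptions, not the report's implementation. Restricting the ranking to marks above the average is an assumption (it follows the `> mean` filter in the Spark code), and the class name, method name and sample marks are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class TopByDistance {

    // Returns up to n register numbers, ordered by largest distance above the mean mark.
    static List<String> top(Map<String, Integer> marks, int n) {
        double mean = marks.values().stream()
                .mapToInt(Integer::intValue)
                .average()
                .orElse(0.0);

        // Assumption: only marks above the mean are candidates.
        List<Map.Entry<String, Integer>> above = new ArrayList<>();
        for (Map.Entry<String, Integer> e : marks.entrySet()) {
            if (e.getValue() > mean) {
                above.add(e);
            }
        }

        // Largest distance from the mean first.
        above.sort((x, y) -> Double.compare(y.getValue() - mean, x.getValue() - mean));

        List<String> result = new ArrayList<>();
        for (int i = 0; i < Math.min(n, above.size()); i++) {
            result.add(above.get(i).getKey());
        }
        return result;
    }

    // Hypothetical sample marks with an average of 50, as in the example above.
    static Map<String, Integer> sample() {
        Map<String, Integer> marks = new LinkedHashMap<>();
        marks.put("R1", 52);
        marks.put("R2", 96);
        marks.put("R3", 30);
        marks.put("R4", 22);
        return marks;
    }

    public static void main(String[] args) {
        System.out.println(top(sample(), 1));
        System.out.println(top(sample(), 10));
    }
}
```

With the sample marks the average is 50, so the student with 96 is ranked ahead of the student with 52, and asking for a single student returns only the one with 96.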
SCOPE OF IMPROVEMENT
This project can be improved further by using Artificial Intelligence concepts and more Machine Learning features. We can widen the selection criteria by using more attributes, such as credits registered and seminars attended. The more features are used, the more reliable the result will be. In future this may lead to software used by every college and university during placements for better results.