PREFERENCES FOR CAR CHOICE IN THE UNITED STATES


Table of Contents
Introduction
Background
Data Analysis
Data Visualization
Conclusion
References
Introduction
The most common applications of statistics are describing a set
of data (descriptive statistics), regression, and hypothesis
testing (inferential statistics). The two main branches are
descriptive and inferential statistics. People who do not have
any formal training in statistics are usually more familiar with
descriptive statistics than with inferential statistics. In this
paper, the data will be analyzed using descriptive statistics, so
we will focus on the descriptive branch of statistics.
Descriptive Statistics Definition
Descriptive statistics are the type of statistical analysis that
helps to describe the data in a meaningful way. They describe
the essential features of the data quantitatively. Descriptive
statistics summarize the given sample as well as the
observations made. These summaries or descriptions can be
either graphical or quantitative.
Background
This study focuses on analyzing and visualizing the data set on
preferences for car choice in the United States. The data set
contains 4654 observations and 71 columns. Several different
types of graphs help describe statistical data, including the
histogram, bar graph, box-and-whisker plot, line graph, scatter
plot, ogive, pie chart, and many more. Generally, the kinds of
measurements that can be used with descriptive statistics are:
The measure of central tendency describes the data that lie at
the center of a given frequency distribution. The main measures
of central tendency are the mean, median, and mode (Nick, 2020).
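As a minimal sketch of the three measures named above, the
following Python snippet uses the standard library statistics
module. The sample values are hypothetical and are not taken
from the car data set.

```python
import statistics

# Hypothetical sample, used only to illustrate the three measures.
sample = [2, 3, 3, 5, 7, 10]

mean_value = statistics.mean(sample)      # arithmetic average
median_value = statistics.median(sample)  # middle of the sorted values
mode_value = statistics.mode(sample)      # most frequent value

print("mean:", mean_value)
print("median:", median_value)
print("mode:", mode_value)
```

With an even number of observations, the median is the average
of the two middle values, which is why it need not be a value
that occurs in the sample.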
The measure of spread describes how the scores are spread across
the entire distribution. Measures of spread include the standard
deviation, variance, quartiles, range, and absolute deviation.
Data Analysis
One of the essential concepts of statistics is data analysis. It
is the process of observing, analyzing, and modeling the data.
The purpose of data analysis is to obtain useful information
from the data and to state conclusions that support
decision-making. Data analysis can be performed with several
techniques and different approaches. Data can be assessed and
analyzed by using analytical and logical approaches to examine
each component of the data provided. Data from various sources
are collected, reviewed, and then interpreted for decision-making
or conclusions. There are several methods for analyzing the
results. Data mining, text analytics, business intelligence, and
data visualization are some of the most commonly used techniques.
Data analysis aims to collect raw data and convert it into
useful decision-making information. The various stages of
analysis of the data are as follows:
i) To make some type of sense out of each data collection,
ii) To look for patterns and relationships both within a
collection and across groups,
iii) To make general discoveries about the phenomena you are
researching.
Before further analysis, I would like to compactly display the
structure of the given data set.
The list below describes the data contents. The descriptive
summary of the data set was produced using R code.
Figure 1.1: Car Frame
Figure 1.2: Price Range
Figure 1.3: Pollution and Speed
Figure 1.4: Pollution and Size
Figure 1.5: Descriptive table for summary.data.frame(Car)
Table 1.1: Summary table for price
Table 1.2: Summary table for acceleration
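The report's summary tables list the minimum, mean, median, and
maximum of each variable. As a rough sketch of how such a table
can be computed, the Python snippet below applies the standard
library to two short columns. The helper name summarize and the
example values are hypothetical; this is not the R code used for
the report.

```python
import statistics


def summarize(columns):
    """Return minimum, mean, median, and maximum for each named column."""
    summary = {}
    for name, values in columns.items():
        summary[name] = {
            "Minimum": min(values),
            "Mean": statistics.mean(values),
            "Median": statistics.median(values),
            "Maximum": max(values),
        }
    return summary


# Hypothetical values standing in for two columns of the data set.
example = {
    "range1": [50, 100, 150, 300],
    "speed1": [55, 85, 85, 140],
}

result = summarize(example)
for name, stats in result.items():
    print(name, stats)
```

Each column is summarized independently, so columns that hold
identical values produce identical rows in the summary.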
From the descriptive summary table, the mean price (price of the
vehicle divided by the logarithm of income) is 4.296 for the
price 1 variable, 4.173 for the price 3 variable, and 4.150 for
the price 5 variable. I excluded price 2, price 4, and price 6
because they have the same mean and median as price 1, price 3,
and price 5, respectively. Range is the distance, in hundreds of
miles, that the vehicle can travel between
refueling/recharging. The mean value for range 1 is 160.49,
followed by 240.38 for range 3 and 312.03 for range 5.
Data Visualization

Data visualization is the representation of data or information
in a chart, graph, or other visual format. It communicates
relationships in the data through images. We need data
visualization because a visual summary of data makes it easier
to identify patterns and trends than looking through a large
number of rows in a spreadsheet. This is how the human brain
works. Since the purpose of data analysis is to gain insights,
data is considerably more valuable when it is visualized. Even
if a data analyst can draw insights from data without
visualization, it will be harder to communicate their meaning
without it. Charts and graphs make communicating data findings
easier, even if you can identify the patterns without them
(Sheskin, 2017).
This is significant because it allows trends and patterns to be
seen more easily. With the rise of big data, we need to be able
to interpret increasingly large batches of data. Machine
learning makes it easier to conduct analyses such as predictive
analysis, which can then serve as helpful visualizations to
present.
Visualizing and Analyzing Categorical Variables
Figure 2.1: Choice of a vehicle among six propositions
From the pie chart, we can create a table for better
understanding.
Choice 5 has the highest percentage (32%), followed by choice 3
(29%), while choice 2 has the lowest (6%).
Table 1.3: Choice of a vehicle among six propositions
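The percentages in the choice table follow directly from the
counts. The short Python sketch below recomputes them from the
counts reported in this paper; the variable names are
illustrative only.

```python
# Counts per choice, as reported in the choice table of this paper.
counts = {
    "choice1": 887,
    "choice2": 269,
    "choice3": 1345,
    "choice4": 349,
    "choice5": 1499,
    "choice6": 305,
}

total = sum(counts.values())  # number of observations

# Share of each choice, rounded to a whole percent.
percentages = {name: round(100 * n / total) for name, n in counts.items()}

most_chosen = max(counts, key=counts.get)
least_chosen = min(counts, key=counts.get)

for name in counts:
    print(f"{name}: {counts[name]} ({percentages[name]}%)")
print("most chosen:", most_chosen, "| least chosen:", least_chosen)
```

The counts sum to 4654, the number of observations in the data
set.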
The variables are college education, size of household greater
than 2, and commute lower than 5 miles a day.
Here 0 represents No, and 1 represents Yes.
Figure 2.2: College
Figure 2.3: Households
Figure 2.4: coml5
The table below summarizes the three charts:
Table 1.3: coml5
Variable types
Body type: one of regular car, sport utility vehicle, sports car,
station wagon, truck, or van, for each proposition z from 1 to 6.
Figure 2.5: Type 1
Figure 2.6: Type 2
Figure 2.7: Type 3
Figure 2.8: Type 4
Figure 2.9: Type 5
Figure 3.0: Type 6

The summary table of the type variable is given below.
Table 1.4: Summary Variable
The most preferred car in the United States is a regular car,
followed by a truck.
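The ranking of body types can be checked by summing each type
over the six propositions. The Python sketch below does this with
the counts from the type summary table of this paper; the
variable names are illustrative only.

```python
# Counts of each body type for propositions 1 to 6,
# as reported in the type summary table of this paper.
type_counts = {
    "van":      [410, 928, 410, 1862, 410, 972],
    "regcar":   [3138, 769, 3138, 362, 3138, 385],
    "truck":    [487, 1851, 487, 1175, 487, 1141],
    "sportuv":  [283, 35, 283, 57, 283, 107],
    "stwagon":  [137, 991, 137, 1124, 137, 1920],
    "sportcar": [199, 80, 199, 74, 199, 129],
}

# Total per body type across all six propositions.
totals = {name: sum(values) for name, values in type_counts.items()}

# Body types ordered from most to least frequent.
ranking = sorted(totals, key=totals.get, reverse=True)

for name in ranking:
    print(name, totals[name])
```

The totals add up to 27924, which is six propositions times 4654
observations.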
Figure 3.1: Type Fuel 1
Figure 3.2: Type Fuel 2
Figure 3.3: Type Fuel 3
Figure 3.4: Type Fuel 4
Figure 3.5: Type Fuel 5
Figure 3.6: Type Fuel 6
The summary of the fuel variable is given in the table below,
compiled from the charts.
Table 1.5: Summary Variable
CNG is the most common fuel, while gasoline is the least common
fuel.
Acceleration: tens of seconds required to reach 30 mph from a
stop. Speed: highest attainable speed, in hundreds of mph.

Figure 3.7: Car Data
Figure 3.8: Car speed
Figure 3.9: Car vs speed
From the summary table, we can conclude that.
Table 1.6: Summary Pollution
Sizes: 0 for a mini, 1 for a subcompact, 2 for a compact, and 3
for a mid-size or large vehicle.
Figure 4.0: Car vs speed
A bar chart shows the relationship between discrete categories.
One axis of the chart represents the categories being compared,
and the other axis indicates a calculated value. The chart shown
above tells us that, for the variable size, the most preferred
configuration is a mid-size or large vehicle, whereas the least
preferred is the mini.
Space: fraction of luggage space in a comparable new gas
vehicle.
Table 1.7: Luggage space
Costs: cost per mile of travel (tens of cents): home recharging
for an electric vehicle, station refueling otherwise.
Stations: fraction of stations that can refuel/recharge the
vehicle.
Table 1.8: Station refuel or recharge
A scatter plot, or scatter graph, is a visual representation of
two variables (here, cost and speed) in a set of data. The plot
uses Cartesian coordinates, with the independent variable x
(speed) on the horizontal axis and the dependent variable y
(cost) on the vertical axis. The scatter plot shows a weak
positive relationship between cost and speed. The correlation
coefficient (r) measures the strength of the linear relationship
between two variables and ranges from -1 to 1. The correlation
coefficient between cost and speed is r = 0.145011, which
confirms that the positive relationship between cost and speed
is weak.
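For reference, the sketch below shows one common way to compute
the Pearson correlation coefficient in plain Python. The function
name pearson_r and the five data points are hypothetical; they
are not the cost and speed columns of the data set, so the
printed value is not the 0.145011 reported above.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of products of deviations, and sums of squared deviations.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when a variable is constant")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical values, used only to illustrate the calculation.
speed = [1, 2, 3, 4, 5]
cost = [2, 4, 5, 4, 5]

r = pearson_r(speed, cost)
print(round(r, 4))
```

Values of r near 0, such as the one reported for cost and speed,
indicate that a straight line explains little of the variation
between the two variables.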
Conclusion
Based on the analysis, we can conclude that the mean price
(price of the vehicle divided by the logarithm of income) is
4.296 for the price 1 variable, 4.173 for the price 3 variable,
and 4.150 for the price 5 variable. We excluded price 2, price 4,
and price 6 because they have the same mean and median as price
1, price 3, and price 5, respectively.
The most preferred choice is choice 5, and the least preferred
is choice 2. Of the respondents, 23% are not college educated,
while 77% are college educated. For 22% of respondents the
household size is greater than 2, and for 78% it is 2 or
smaller. In the sample data, 36% commute less than 5 miles a
day, while 64% commute 5 miles or more a day. The most preferred
vehicle is a regular car, the most preferred fuel is CNG, and
the least chosen fuel is gasoline. The correlation coefficient
(r) between cost and speed is 0.145011, which shows a weak
positive relationship between cost and speed.
References
Reid, H. (2013, August). Introduction to statistics. SAGE
Publications.
Jackson, S. L. (2017). Statistics plain and simple. Boston, MA:
Cengage Learning.
Alan, J. (2018). Ohio touts successes against human trafficking.
Ohio: The Columbus Dispatch.
Erik, M. (2017). Regression analysis. Market Research, 12(7),
31.
Fishe, R. (2016). The social relationship between the teenager's
psychological changes and physiological changes. Journal of
Medical Statistics, 11(2), 32.
Sheskin, D. J. (2017). Handbook of parametric and
nonparametric statistical procedures. New York: CRC Press.
Jackson, S. (2017). Statistics plain and simple. Cengage
Learning. Retrieved from
phoenix.vitalsource.com/#/books/9781337681728
Table 1.1: price and range
          price1    price3    price5    range1    range3    range5
Minimum   0.598726  0.598726  0.635196  50        75        250
Mean      4.296262  4.173241  4.149952  160.4856  240.3792  312.0327
Median    4.138684  4.039575  4.039575  125       250       300
Maximum   17.37056  17.37056  17.37056  300       400       400
Table 1.2: acceleration and speed
          acc1      acc3      acc5      speed1    speed3    speed5
Minimum   2.5       2.5       2.5       55        85        85
Mean      4.17254   4.272991  4.054469  84.66695  107.3055  107.3421
Median    4         4         4         85        95        95
Maximum   6         6         6         140       140       140
Table 1.3: choice
choice    Count  Percent
choice1   887    19%
choice2   269    6%
choice3   1345   29%
choice4   349    7%
choice5   1499   32%
choice6   305    7%
Table 1.3: college, hsg2, and coml5 (0 = No, 1 = Yes)
Variable  Value  Count  Percent
college   0      1079   23%
college   1      3575   77%
hsg2      0      3621   78%
hsg2      1      1033   22%
coml5     0      2989   64%
coml5     1      1665   36%
Table 1.4: type
          type1  type2  type3  type4  type5  type6  Total
van       410    928    410    1862   410    972    4992
regcar    3138   769    3138   362    3138   385    10930
truck     487    1851   487    1175   487    1141   5628
sportuv   283    35     283    57     283    107    1048
stwagon   137    991    137    1124   137    1920   4446
sportcar  199    80     199    74     199    129    880
Total     4654   4654   4654   4654   4654   4654   27924
Table 1.5: fuel
          fuel1  fuel2  fuel3  fuel4  fuel5  fuel6
cng       1178   1178   2330   2330   -      -
methanol  3476   3476   -      -      -      -
electric  -      -      2324   2324   1175   1175
gasoline  -      -      -      -      3479   3479
Table 1.6: pollution
          pollution1  pollution2  pollution3  pollution4  pollution5  pollution6
Mean      0.0853      0.0853      0.4137      0.4137      0.5941      0.5941
Median    0           0           0.4         0.4         0.6         0.6
Mode      0           0           0.4         0.4         0.25        0.25
Minimum   0           0           0.1         0.1         0.25        0.25
Maximum   0.6         0.6         0.75        0.75        1           1
Table 1.7: space
          space1    space2    space3    space4    space5  space6
Mean      0.850774  0.850774  0.925677  0.925677  1       1
Median    1         1         1         1         1       1
Minimum   0.7       0.7       0.7       0.7       1       1
Maximum   1         1         1         1         1       1
Table 1.8: station
          station1  station2  station3  station4  station5  station6
Mean      0.089514  0.089514  0.382768  0.382768  0.823915  0.823915
Median    0         0         0.3       0.3       1         1
Minimum   0         0         0.1       0.1       0.1       0.1
Maximum   0.7       0.7       0.7       0.7       1         1
