U.S. Food and Drug Administration
Office of the Commissioner/Economics Staff
Internship Reflection – BSOS386
Upon completing my previous internship, I was excited to gain greater insight
into the microeconomic analysis conducted at the federal level. The FDA Commissioner's
office houses a staff of about 35 regulatory economists, with whom I had the pleasure of working
throughout the first half of my senior year. This is the largest staff I have been a part of to date,
which proved both beneficial and challenging.
Excited to take in as much as I could in the little time I had, I hit the ground running. The
Economics Staff had recently tapped into the vast network of ScanTrack data available through
the agency's subscription to AC Nielsen. Nielsen's ScanTrack data records grocery and
convenience store purchases from 2002 to the present, identified by UPC. The data was
broken down into hundreds of categorical variables, with multi-thousand-line spreadsheets for
each major department and subcategory. Having worked with similarly sized data at my other two
internships, I was prepared to use many of the same navigational and analytical tactics to help
me transition from corporate taxes into something more tangible, like bakery products.
The FDA had recently launched a 'Sodium Reduction Initiative' to combat the rising
consumption of processed foods, which are notorious for their high preservative and sodium
content. Excessive sodium intake poses severe long-term health risks, with the potential for
further adverse spillover effects. The agency set an initial target level for daily
consumption that has since been revisited for stricter regulatory policy proposals. The Economics
Staff was tasked with quantifying a new target, and sought to investigate past and current
consumer trends in Nielsen's ScanTrack data.
ScanTrack data arrives periodically in 'waves' that give subscribers the most
up-to-date sales numbers. Two senior economists had recently begun the initial data sourcing and
analysis, running the latest Wave 5 data through SAS. My first project was to audit
their reconfiguration and identify the discrepancies between the inputs and outputs. In the 'Raw'
ScanTrack data, the would-be product description column combined the name with a respective
size and measuring unit; a possible root cause for the post-SAS misalignment. My first instinct was
to separate the two in Stata, before realizing there had to be a simpler way to create
the two columns in Excel. Sure enough, with a little troubleshooting I was able to parse them
with the 'Convert Text to Columns' feature. Within the conversion window, I set the parsed
column destinations using a fixed-width break line to route each into a new 'Product
Description' or 'Size' category. This fix would be worth applying to the hundreds of ScanTrack
spreadsheets, but for the purposes of this project I only reformatted 'Wave 5 – Part 1', passing
my suggestion on to my associates.
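The description/size split above can also be sketched outside Excel. Here is a minimal pandas version, assuming a 'NAME SIZE UNIT' layout; the sample rows and column names are hypothetical, not actual ScanTrack fields:

```python
import pandas as pd

# Hypothetical rows mimicking the raw layout, where product name,
# size, and measuring unit share a single description column.
raw = pd.DataFrame({"DESCRIPTION": ["WHOLE WHEAT BREAD 16 OZ",
                                    "FAJITA MIX 8 OZ"]})

# Peel the trailing "size unit" tokens off the name, analogous to
# a fixed-width 'Text to Columns' split in Excel.
parts = raw["DESCRIPTION"].str.extract(
    r"^(?P<Product_Description>.+?)\s+(?P<Size>\d+\s*[A-Z]+)$")
raw = raw.join(parts)
```

A pattern-based split like this is more robust than a fixed-width break when name lengths vary, which matters if the same transformation is to be applied across hundreds of spreadsheets.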
One of the most noticeable issues was that the SAS transformation eliminated the product
descriptions, making it very difficult to cross-reference variables between the
input and output spreadsheets. My best bet was to align the spreadsheets by UPC, a
reference variable that identifies an exact good. After adding a few filters to the primary
columns, I was able to line up the pre-SAS and post-SAS spreadsheets. Later, I
filtered Wave 5 even further by adding two more tiers for month and year, transcribing the
months into their respective numeric values. With this discrepancy sorted out (no pun
intended), I could revisit the key 'Average Unit Price' and 'Sales Units' variables. Using Excel's
IF function, I created a fill-down output column to identify which goods had discrepancies
after the SAS transformation; it returned 'MATCH' or 'NO MATCH', coded in
green and red respectively. I used the same thorough approach to cross-reference the ~30 vitamin
and nutrient variables associated with these goods, using a series of filters on stacked datasets to
expedite the assignment. I was later asked to condense the data even further in a separate
spreadsheet by eliminating duplicate products. Referencing a Do-File from
my time at the NJ Treasury, I rewrote the Stata code for eliminating duplicates in this new
spreadsheet. The statistical significance of my associates' results relied primarily on the quality of
the data used; a pitfall I was intent on closing. While I was unable to rewrite the SAS code
to fix these issues, I was very happy to contribute my time to my associates' new project and
to work with Nielsen's highly explorable datasets.
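The UPC alignment and MATCH/NO MATCH flag can be expressed compactly in pandas. This is a sketch under assumed column names; the UPCs and figures below are made up:

```python
import pandas as pd

# Toy pre-SAS and post-SAS extracts keyed by UPC (assumed columns).
pre = pd.DataFrame({"UPC": [111, 222, 333],
                    "Average Unit Price": [2.99, 4.49, 1.25],
                    "Sales Units": [100, 50, 75]})
post = pd.DataFrame({"UPC": [111, 222, 333],
                     "Average Unit Price": [2.99, 4.49, 1.30],
                     "Sales Units": [100, 55, 75]})

# Align the two spreadsheets on UPC, then flag rows where either
# key variable changed during the SAS transformation.
merged = pre.merge(post, on="UPC", suffixes=("_pre", "_post"))
same = ((merged["Average Unit Price_pre"] == merged["Average Unit Price_post"])
        & (merged["Sales Units_pre"] == merged["Sales Units_post"]))
merged["Flag"] = same.map({True: "MATCH", False: "NO MATCH"})

# The later de-duplication step (done in Stata in the essay) has a
# one-line pandas analogue:
unique_products = merged.drop_duplicates(subset="UPC")
```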
The first couple of months of my internship were spent learning the layout of
Nielsen's datasets and familiarizing myself with the team's project plan; my thoroughness there
aimed to build upon my Excel skills while delivering the most accurate results for my colleagues.
I also worked simultaneously on a couple of other projects to learn more about the research being
conducted in other industries. One economist was working on a regulation draft for sunscreen
and cosmetics that had a few missing components. The data my colleague was using was
missing brand and manufacturer/distributor names for roughly 4,000 of 11,000 products. This seemed
daunting at first glance, but thankfully it wasn't the first time I had compiled
hard-to-find data. Using DailyMed, I could simply plug in the FDA-provided NDC codes
associated with each product to return the requested brand name. This went smoothly at first,
until I came across some foreign makeup brands that weren't registered in DailyMed and had
drug labels in another language. These products were frustrating, but I found that the same
companies had other goods in the spreadsheet listed under alternate English names. Now
comprehensible, these could be searched to find each company's legal name and parent company
where applicable. Using DailyMed for this project was a great way to get acquainted with the
cosmetic industry's biggest manufacturers and the various regulatory speed bumps they
encounter.
Shortly after completing my cosmetics project, I sought out the lone economist on
staff who works with pharmaceuticals, medical devices, and biotechnology-related policies (as
appetizing as food safety was). She had been meaning to start a research study on how long it
takes to develop a drug, with regard to molecular and target development, and how these
processes have changed over time under various circumstances. I had never had the opportunity
to lay the groundwork for a research project, and was determined to compile as much as I could.
With my FDA email, I could access a database called Citeline, a.k.a. Pharmaprojects, which
compiles clinical drug development lifecycle events and statistics. Without a chemistry or biology
background it was difficult to understand what many of these drug components were, but from
an economic perspective these commercial codes and targets were just like any other variables. I
have generally found it easier to work with tangible data, but had eagerly sought an
opportunity to work with data from an entirely foreign industry such as pharmaceuticals.
My associate had provided me with an initial spreadsheet containing ~70 targeted drugs' oldest
active ingredient, initial public introduction, and manufacturer, tasking me with finding their
trade/commercial names, drug codes, and targets.
After exploring the database's interface, I could identify the components
asked of me. It took a few entries to discover that Pharmaprojects accepts
more than one drug per search, which eliminated the repetitive step of
searching one at a time. The individual drug development timeline pages wouldn't let me copy
and paste the components I needed, which was especially inconvenient for
irregularly complex targets such as phosphoribosylglycinamide formyltransferase. Lastly, I
cross-referenced the drug codes in PubMed to see when they were initially introduced to the
public, pointing out any inconsistencies between them and FDA records. While compiling these
metrics, I also completed some basic research on the firms that produced these drugs. Many of
the big drug producers, such as Bristol-Myers Squibb, GlaxoSmithKline, Merck, and Eli Lilly, had
previously acquired other firms, so I noted these M&As where applicable. The
Pharmaprojects database contained other very interesting statistics and facts that I'm not allowed
to disclose, but they were appealing to read about. My associate was very pleased with my work to
get this going after previous postponements, but was unfortunately unable to revisit the project's
next steps before my departure. I look forward to following her ongoing research.
In the final weeks of my internship, I began working on a larger project with another
senior economist for whom I had completed smaller tasks earlier in the semester. He
had recently begun working with recalled products through a series of Recall Enterprise System
(RES) and Reportable Food Registry (RFR) datasets, to estimate the risk associated with
distributing non-compliant goods and the potential deadweight loss they impose on the food
market. We were most interested in FY2013, as it had the most abundant and diverse
set of goods. The RES/RFR datasets gathered basic product characteristics such as
manufacturer, recall date, and primary/secondary agents, but lacked the respective prices.
Immediately, I suggested seeking out prices in the raw Nielsen spreadsheets, thinking that one
or more of these products had to be recorded there. Unfortunately, the raw
Nielsen data lists truncated product names, making it very difficult to find an exact product. We
later decided it would make the most sense to find a comparable match to each RES/RFR
product, as it would return similar results.
I added another tab to our initial spreadsheet to line up the commodities and their
reference categories, with an additional column labeled 'Nielsen Match' for the truncated
Nielsen products. In case I had to return to one of the hundred thousand Nielsen spreadsheets, I
added a few more columns for UPC, department, and category, before later adding the key
columns: Total Sales Revenue (2013), Total Equivalent Units, Price per 16 Ounces, Serving
Size, Grams per (Price Per 16 Ounces), and the weekly totals for revenues and quantities.
Searching through the Nielsen departments was very tedious, but by filtering on several key
variables such as 'Flavor', I could find matches relatively quickly. I set up a SUM function to
tally the aggregate annual revenue and units sold for each good; this way, all I had to do
was copy and paste from Nielsen, and Excel would keep a running total for the match, as
well as for the entire recall spreadsheet, on both variables. I used the same methodology for the
'Price per 16 Ounces' column, the equivalized base used by Nielsen to compare goods that
normally would be incomparable (e.g., fajita mix vs. assorted cookies). Dividing 'Total Sales
Revenue (2013)' by 'Total Equivalent Units' down the column yielded the 'Price per 16 Ounces',
a rough estimate of what a particular good would cost at current market value, as approximated
by the sales volume referenced through Nielsen. Next, I compiled the serving sizes for each good
using the FDA's Serving Size Final Rule, converting each total-volume serving size to grams.
The last component I would add to my associate's report was the reference amount customarily
consumed, an externally determined consumption reference enforced through the FDA's
Serving Size Final Rule. With everything in place, I could crunch the totals and
determine one of the report's key metrics: Mean Retail Sales Dollars per Serving.
I have provided a sample row with slightly altered figures to depict my findings for my
recall project:

Total Equivalent Units (16 oz.): 4,956,992,908
Retail Sales Dollars ($): $9,162,671,802.18
Equivalized Base (oz.): 16
Total Number (#) of Retail Ounces (oz.): 79,311,886,528
Total Number (#) of Retail Grams (g): 2,248,452,327,126
Total Goods in Servings (servings): 44,969,046,543
Mean Retail Sales Dollars ($) per Serving: $0.20

The 'Total Equivalent Units' are the total units of this particular good sold in FY2013;
Retail Sales Dollars ($) is the total revenue accrued from annual sales of the given good;
Equivalized Base (oz.) is the 16-ounce reference metric used to control for variation in food
measurements; Total Number (#) of Retail Ounces (oz.) is the Total Equivalent Units
(16 oz.) times the Equivalized Base (oz.); Total Number (#) of Retail Grams (g) is the Total
Number (#) of Retail Ounces (oz.) multiplied by 28.3495, the conversion rate from ounces to
grams; Total Goods in Servings is the Total Number (#) of Retail Grams (g) divided by the
reference amount customarily consumed (in this case 50 grams); and finally, the Mean Retail
Sales Dollars ($) per Serving is the Retail Sales Dollars ($) divided by the Total Goods in
Servings (servings). In this example, the cost, defined as the lost revenue incurred by the
manufacturer, can be approximated as $0.20 per serving. These numbers are again altered for
my internship review, and do not depict statistically significant findings or represent
approved/finalized FDA figures.
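The arithmetic above can be checked with a few lines of Python. This sketch simply reproduces the sample row's (altered, non-official) figures, with the 50 g reference amount taken from the example:

```python
# Sample-row inputs (altered figures, as in the essay).
equivalent_units = 4_956_992_908         # Total Equivalent Units (16 oz.)
retail_sales_dollars = 9_162_671_802.18  # Retail Sales Dollars ($)
equivalized_base_oz = 16                 # Nielsen's 16-ounce reference base
OZ_TO_G = 28.3495                        # ounces-to-grams conversion rate
racc_g = 50                              # reference amount customarily consumed (g)

# Derived columns, in the order they appear in the sample row.
total_retail_oz = equivalent_units * equivalized_base_oz          # 79,311,886,528
total_retail_g = total_retail_oz * OZ_TO_G                        # ~2,248,452,327,126
total_servings = total_retail_g / racc_g                          # ~44,969,046,543
mean_dollars_per_serving = retail_sales_dollars / total_servings  # ~0.20

# The per-good 'Price per 16 Ounces' is simply revenue over equivalent units.
price_per_16_oz = retail_sales_dollars / equivalent_units
```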
This example was sourced from a similar evaluation that measured an entire category of
food (e.g., bread), and so depicts a much larger result, representative of the countless firms that
produce bread. My project, conversely, analyzed a select group of recalled products spanning
multiple food categories, so the per-serving figures I found would appropriately be much smaller
than the one described above. This is one of many statistics my associate will use in his final
public health risk estimation once he compiles the remaining inputs and runs them through
@RISK. The project was very data intensive, giving me another opportunity to strengthen my
Excel skills and gain more experience running basic Stata syntax to clean my datasets. While
working on my recall and pharmaceutical projects, I also began teaching myself SAS
programming through one of the FDA's many software subscriptions. I usually worked through
these training modules in my spare time, and in the occasional break in the action at work, to
strengthen my statistical toolkit and gain a better understanding of computer programming. The
syntax and interface are a little different from Stata and SPSS, but certainly nothing
overwhelming. I was very fortunate to have access to a high-demand program, not taught at
UMD, that could familiarize me with new analytical methods to supplement my theoretical
coursework. I am eager to continue teaching myself SAS over the next few years to improve my
software versatility.
Overall, I am happy to reflect on this experience as a great microeconomic opportunity to
gain specialized insight into health economics and government regulation. This time there was
no other intern to collaborate with, so my ability to communicate across a relatively large
and dispersed department was certainly tested. Despite those conditions, everyone on staff was
very helpful and accommodating during my tenure, allowing me to get as much out of these
short few months as I could. While I will miss my fellow economists at the FDA dearly, I am
excited to return to Washington, D.C. in my final semester, where I will conclude my
undergraduate internship experience at the executive agency level, working alongside
macroeconomists in the U.S. Department of the Treasury's Office of International Affairs.

More Related Content

Viewers also liked

SMRT Internship Sharing_V2
SMRT Internship Sharing_V2SMRT Internship Sharing_V2
SMRT Internship Sharing_V2Ethan Chia
 
JoannaBergerMScDissertation
JoannaBergerMScDissertationJoannaBergerMScDissertation
JoannaBergerMScDissertationJoanna Berger
 
Rolling stock by manufacturer CSR rolling stock corporation limited china
Rolling stock by manufacturer CSR rolling stock corporation limited chinaRolling stock by manufacturer CSR rolling stock corporation limited china
Rolling stock by manufacturer CSR rolling stock corporation limited chinaVoice Malaysia
 
Pdf buoi1 2-on-page-tran-ngoc-chinh-mastercode.vn
Pdf buoi1 2-on-page-tran-ngoc-chinh-mastercode.vnPdf buoi1 2-on-page-tran-ngoc-chinh-mastercode.vn
Pdf buoi1 2-on-page-tran-ngoc-chinh-mastercode.vnMasterCode.vn
 
Smart Lock for Password @ Game DevFest Bangkok 2015
Smart Lock for Password @ Game DevFest Bangkok 2015Smart Lock for Password @ Game DevFest Bangkok 2015
Smart Lock for Password @ Game DevFest Bangkok 2015Somkiat Khitwongwattana
 
Pd fbuoi7 8--tongquanseo-mastercode.vn
Pd fbuoi7 8--tongquanseo-mastercode.vnPd fbuoi7 8--tongquanseo-mastercode.vn
Pd fbuoi7 8--tongquanseo-mastercode.vnMasterCode.vn
 
online hotel management system
online hotel management system online hotel management system
online hotel management system ANSHUL GUPTA
 
Blending with deltav
Blending with deltavBlending with deltav
Blending with deltavLuis Atencio
 
Summit72_extremedreams
Summit72_extremedreamsSummit72_extremedreams
Summit72_extremedreamsAsh Fusiarski
 

Viewers also liked (13)

SMRT Internship Sharing_V2
SMRT Internship Sharing_V2SMRT Internship Sharing_V2
SMRT Internship Sharing_V2
 
JoannaBergerMScDissertation
JoannaBergerMScDissertationJoannaBergerMScDissertation
JoannaBergerMScDissertation
 
Ppt0000003
Ppt0000003Ppt0000003
Ppt0000003
 
Eastern Bloc vehicles
Eastern Bloc vehiclesEastern Bloc vehicles
Eastern Bloc vehicles
 
Rolling stock by manufacturer CSR rolling stock corporation limited china
Rolling stock by manufacturer CSR rolling stock corporation limited chinaRolling stock by manufacturer CSR rolling stock corporation limited china
Rolling stock by manufacturer CSR rolling stock corporation limited china
 
Pdf buoi1 2-on-page-tran-ngoc-chinh-mastercode.vn
Pdf buoi1 2-on-page-tran-ngoc-chinh-mastercode.vnPdf buoi1 2-on-page-tran-ngoc-chinh-mastercode.vn
Pdf buoi1 2-on-page-tran-ngoc-chinh-mastercode.vn
 
Smart Lock for Password @ Game DevFest Bangkok 2015
Smart Lock for Password @ Game DevFest Bangkok 2015Smart Lock for Password @ Game DevFest Bangkok 2015
Smart Lock for Password @ Game DevFest Bangkok 2015
 
Pokemon GO 101@Nextzy
Pokemon GO 101@NextzyPokemon GO 101@Nextzy
Pokemon GO 101@Nextzy
 
Pd fbuoi7 8--tongquanseo-mastercode.vn
Pd fbuoi7 8--tongquanseo-mastercode.vnPd fbuoi7 8--tongquanseo-mastercode.vn
Pd fbuoi7 8--tongquanseo-mastercode.vn
 
online hotel management system
online hotel management system online hotel management system
online hotel management system
 
Blending with deltav
Blending with deltavBlending with deltav
Blending with deltav
 
Summit72_extremedreams
Summit72_extremedreamsSummit72_extremedreams
Summit72_extremedreams
 
US9314934B2
US9314934B2US9314934B2
US9314934B2
 

Similar to FDA Internship Review

NEW PRODUCT DEVELOPMENT FOR THE COMPANY DELL INC.ANALYZING THE.docx
NEW PRODUCT DEVELOPMENT FOR THE COMPANY DELL INC.ANALYZING THE.docxNEW PRODUCT DEVELOPMENT FOR THE COMPANY DELL INC.ANALYZING THE.docx
NEW PRODUCT DEVELOPMENT FOR THE COMPANY DELL INC.ANALYZING THE.docxhenrymartin15260
 
2 Chapter 1 The Where, Why, and How of Data Collection Busines.docx
2 Chapter 1  The Where, Why, and How of Data Collection Busines.docx2 Chapter 1  The Where, Why, and How of Data Collection Busines.docx
2 Chapter 1 The Where, Why, and How of Data Collection Busines.docxlorainedeserre
 
Kaur_S_TermProject
Kaur_S_TermProjectKaur_S_TermProject
Kaur_S_TermProjectSupan Kaur
 
Ehr presentation script for blog
Ehr presentation script for blogEhr presentation script for blog
Ehr presentation script for blogMargaret Henderson
 
Five creative search solutions using text analytics
Five creative search solutions using text analyticsFive creative search solutions using text analytics
Five creative search solutions using text analyticsEnterprise Knowledge
 
BUSI 331Marketing Research Report Part 4 Instructions.docx
BUSI 331Marketing Research Report Part 4 Instructions.docxBUSI 331Marketing Research Report Part 4 Instructions.docx
BUSI 331Marketing Research Report Part 4 Instructions.docxhumphrieskalyn
 
Business Analytics
Business AnalyticsBusiness Analytics
Business AnalyticsBarry Shore
 
Component inventory reduction
Component inventory reductionComponent inventory reduction
Component inventory reductionTristan Wiggill
 
DB_REPORT_final_revised
DB_REPORT_final_revisedDB_REPORT_final_revised
DB_REPORT_final_revisedRohan Singla
 
Pharmaceutical Serialization Strategic Planning Guide
Pharmaceutical Serialization Strategic Planning GuidePharmaceutical Serialization Strategic Planning Guide
Pharmaceutical Serialization Strategic Planning GuideMichael Stewart
 
Submit your paper in the following order format.I. Author a bi.docx
Submit your paper in the following order  format.I. Author a bi.docxSubmit your paper in the following order  format.I. Author a bi.docx
Submit your paper in the following order format.I. Author a bi.docxdeanmtaylor1545
 
in the world of data analytics, there is a multitude of visualizat
in the world of data analytics, there is a multitude of visualizatin the world of data analytics, there is a multitude of visualizat
in the world of data analytics, there is a multitude of visualizatLizbethQuinonez813
 
Internship at the Bureau of Labor Statistics by Ashley El Rady
Internship at the Bureau of Labor Statistics by Ashley El RadyInternship at the Bureau of Labor Statistics by Ashley El Rady
Internship at the Bureau of Labor Statistics by Ashley El RadyBrown Fellows Program
 

Similar to FDA Internship Review (20)

NEW PRODUCT DEVELOPMENT FOR THE COMPANY DELL INC.ANALYZING THE.docx
NEW PRODUCT DEVELOPMENT FOR THE COMPANY DELL INC.ANALYZING THE.docxNEW PRODUCT DEVELOPMENT FOR THE COMPANY DELL INC.ANALYZING THE.docx
NEW PRODUCT DEVELOPMENT FOR THE COMPANY DELL INC.ANALYZING THE.docx
 
2 Chapter 1 The Where, Why, and How of Data Collection Busines.docx
2 Chapter 1  The Where, Why, and How of Data Collection Busines.docx2 Chapter 1  The Where, Why, and How of Data Collection Busines.docx
2 Chapter 1 The Where, Why, and How of Data Collection Busines.docx
 
Relational Database
Relational DatabaseRelational Database
Relational Database
 
Kaur_S_TermProject
Kaur_S_TermProjectKaur_S_TermProject
Kaur_S_TermProject
 
Ehr presentation script for blog
Ehr presentation script for blogEhr presentation script for blog
Ehr presentation script for blog
 
mostrecentresume
mostrecentresumemostrecentresume
mostrecentresume
 
mostrecentresume
mostrecentresumemostrecentresume
mostrecentresume
 
Five creative search solutions using text analytics
Five creative search solutions using text analyticsFive creative search solutions using text analytics
Five creative search solutions using text analytics
 
BUSI 331Marketing Research Report Part 4 Instructions.docx
BUSI 331Marketing Research Report Part 4 Instructions.docxBUSI 331Marketing Research Report Part 4 Instructions.docx
BUSI 331Marketing Research Report Part 4 Instructions.docx
 
Business Analytics
Business AnalyticsBusiness Analytics
Business Analytics
 
Component inventory reduction
Component inventory reductionComponent inventory reduction
Component inventory reduction
 
DB_REPORT_final_revised
DB_REPORT_final_revisedDB_REPORT_final_revised
DB_REPORT_final_revised
 
Pharmaceutical Serialization Strategic Planning Guide
Pharmaceutical Serialization Strategic Planning GuidePharmaceutical Serialization Strategic Planning Guide
Pharmaceutical Serialization Strategic Planning Guide
 
Submit your paper in the following order format.I. Author a bi.docx
Submit your paper in the following order  format.I. Author a bi.docxSubmit your paper in the following order  format.I. Author a bi.docx
Submit your paper in the following order format.I. Author a bi.docx
 
Hh
HhHh
Hh
 
in the world of data analytics, there is a multitude of visualizat
in the world of data analytics, there is a multitude of visualizatin the world of data analytics, there is a multitude of visualizat
in the world of data analytics, there is a multitude of visualizat
 
assesment 6
assesment 6assesment 6
assesment 6
 
Data analytics
Data analyticsData analytics
Data analytics
 
Internship at the Bureau of Labor Statistics by Ashley El Rady
Internship at the Bureau of Labor Statistics by Ashley El RadyInternship at the Bureau of Labor Statistics by Ashley El Rady
Internship at the Bureau of Labor Statistics by Ashley El Rady
 
Mighty Guides- Data Disruption
Mighty Guides- Data DisruptionMighty Guides- Data Disruption
Mighty Guides- Data Disruption
 

More from Daniel Gardner

More from Daniel Gardner (6)

ECON490 Assignment 3
ECON490 Assignment 3ECON490 Assignment 3
ECON490 Assignment 3
 
HCaTS Blog
HCaTS BlogHCaTS Blog
HCaTS Blog
 
NCAGE and SAM Blog
NCAGE and SAM BlogNCAGE and SAM Blog
NCAGE and SAM Blog
 
FAR 'Final Rule' Blog
FAR 'Final Rule' BlogFAR 'Final Rule' Blog
FAR 'Final Rule' Blog
 
Project Essay
Project EssayProject Essay
Project Essay
 
Assignment 1 GVPT170
Assignment 1 GVPT170Assignment 1 GVPT170
Assignment 1 GVPT170
 

FDA Internship Review

  • 1. Page 1 U.S. Food and Drug Administration Office of the Commissioner/Economics Staff Internship Reflection – BSOS386 Upon the completion of my previous internship, I was excited to acquire greater insight into the microeconomic analysis conducted at the federal level. Within the FDA Commissioner's office lies a group of about 35 regulatory economists, who I had the pleasure of working with throughout the first half of my senior year. This is the largest staff I have been a part of to date, which proved both beneficial and challenging. Excited to take in as much as I could in the little time I had, I hit the ground running. The Economics Staff had recently tapped into the vast network of ScanTrack data provided through the agency's subscription to AC Nielsen. Nielsen's ScanTrack data contains records of grocery/convenience store goods purchased from 2002-present, through UPCs. The data was broken down into hundreds of categorical variables with multi-thousand line spreadsheets for each major department and subcategory. Having worked with similar size data at my other two internships, I was prepared to use many of the same navigational and analytical tactics to help me transition from corporate taxes into something more tangible like bakery products. The FDA had recently initiated a 'Sodium Reduction Initiative' to combat the rising consumption of processed foods; which are notorious for their high preservative and sodium content. Excessive sodium intake can pose severe long-term health risks, with the potential to develop more adverse spillover effects. The agency set an initial target level for daily consumption that has since been revisited for stricter regulation policy proposals. The Economics Staff was tasked with quantifying a new target, and sought to investigate previous and current consumer trends in Nielsen’s ScanTrack data. ScanTrack data is provided periodically in ‘waves’ to provide subscribers with the most up to date sales numbers. 
Two senior economists had recently begun the initial data sourcing and analysis, running the most up to date Wave 5 data through SAS. My first project was to audit
  • 2. Page 2 their reconfiguration and identify the discrepancies between the inputs and outputs. In the ‘Raw’ ScanTrack data, the would-be production description column included the name and a respective size and measuring unit; a possible root cause for post-SAS misalignment. My first instinct was to separate the two through Stata, before realizing that there had to be a simpler method to create the two columns in Excel. Sure enough, with a little troubleshooting I was able to parse the two through the ‘Convert Text to Columns’ tab. Within the conversion window, I set the parsed column destinations using a fixed width break line to designate each to a new ‘Product Description’ and ‘Size’ category. This would be useful to apply to the hundreds of ScanTrack spreadsheets, but for the nature of this project I only reformatted ‘Wave 5 – Part 1’; passing on my suggestion to my associates. One of the most noticeable issues was that the SAS transformation eliminated the product descriptions, making it very difficult to cross-reference the various reference variables between input and output spreadsheets. My best bet was to align the spreadsheets together by UPCs, a reference variable that would identify an exact good. After adding a few filters to the primary columns, I was successfully able to line up the pre-SAS and post-SAS spreadsheets. Later, I filtered Wave 5 even further by adding two more tiers for month and year; transcribing the months into their respective number values. Now that this discrepancy was sorted out (no pun intended), I could revisit the key ‘Average Unit Price’ and ‘Sales Units’ variables. Using Excel’s IF function, I created a mass applicator output column to identify which goods had discrepancies after the SAS transformation. The column would return ‘MATCH’ or ‘NO MATCH’ coded in green and red respectively. 
I used the same thorough approach to cross-reference the ~30 vitamin and nutrient variables associated with these goods; using a series of filters on stacked datasets to expedite my assignment. I was later asked to condense the data even further in a separate spreadsheet, eliminating the duplicate products. Referencing a previously utilized Do-File from my time at the NJ Treasury, I rewrote the Stata code for eliminating duplicates for this new spreadsheet. The statistical significance of my associates results relied primarily on the quality of data used; a mathematical pitfall I was intent to fill. While I was unable to rewrite the SAS code to fix these issues, I was very happy to contribute my time to my associates’ new project and working with Nielsen’s highly explorable datasets. The first couple months of my internship were spent getting to know the layout of Nielsen’s datasets and familiarizing myself with their project plan; and so, my thoroughness aimed to build upon my Excel skills while achieving the most accurate results for my colleagues. I also worked simultaneously on a couple of other projects to learn more about the research being conducted in other industries. One economist was working on a regulation draft for sunscreen and cosmetics which had a few missing components to it. The data my colleague was using was missing brand and manufacturing/distributor names for ~4,000/11,000 products. This seemed daunting at first glance, but thankfully it wasn’t the first time I had compiled relatively nonexistent data. Using DailyMed, I could simply plug in the FDA provided NDC codes associated with each product to return the requested brand name. This went smoothly at first until I came across some foreign makeup brands that weren’t registered in DailyMed and had drug labels that were in another language. These products were frustrating, but I found that these companies had other goods in this spreadsheet with alternate English names on them. Now
  • 3. Page 3 comprehendible, I could search for these companies for their legal names and parent companies where applicable. Using DailyMed for this project was a great way to get acquainted with the cosmetic industry’s biggest manufacturers and the various regulatory speedbumps they encounter. Shortly after the completion of my cosmetics project, I sought out the lone economist on staff that works with pharmaceuticals, medical devices, and biotechnology related policies (as appetizing as food safety was). She had been meaning to start a research study to see how long it takes to develop a drug, with regards to molecular and target development; and how these processes have changed over time under various circumstances. I had never had the opportunity to lay the groundwork for a research project, and was determined to compile as much as I could. With my FDA email, I could access a database called Citeline, a.k.a. Pharmaprojects, which compiled clinical drug development lifecycle events and statistics. Without a chemistry/biology background it was difficult to understand what many of these drug components were, but from an economic perspective these commercial codes and targets were just like any other variables. I have generally found it easier to work with tangible data, but have anxiously sought an opportunity to work with new data from an entirely foreign industry such as pharmaceuticals. My associate had provided me with an initial spreadsheet containing ~70 targeted drugs’ oldest active ingredient, initial public introduction, and manufacturer; delegating me to find their trade/commercial name, drug code, and targets. After exploring the database’s interface, I could properly identify the necessary components asked of me. It took me a few entries to figure out that Pharmaprojects could hold more than one drug input per search entry, which eliminated the unnecessary repetitive step of searching one at a time. 
The individual drug development timeline pages wouldn't let me copy and paste the components I needed, which was especially inconvenient for irregularly complex targets such as phosphoribosylglycinamide formyltransferase. Lastly, I cross-referenced the drug codes in PubMed to see when they were first introduced to the public, noting any inconsistencies between them and FDA records. While compiling these metrics, I also did some basic research on the firms that produced these drugs. Many of the big producers (Bristol-Myers Squibb, GlaxoSmithKline, Merck, Eli Lilly, etc.) had previously acquired other firms, so I noted these M&As where applicable. The Pharmaprojects database contained other very interesting statistics and facts that I'm not allowed to disclose, but they were fascinating to read about. My associate was very pleased with my work getting the project going after previous postponements, but she was unfortunately unable to revisit its next steps before my departure. I look forward to following her ongoing research. In the final weeks of my internship, I began working on a larger project with another senior economist for whom I had completed smaller tasks earlier in the semester. He had recently begun working with recalled products through a series of Recall Enterprise System (RES) and Reportable Food Registry (RFR) datasets to estimate the risk associated with distributing non-compliant goods and the potential deadweight loss they impose on the food market. We were most interested in FY2013, as it had the most abundant and diverse set of goods. The RES/RFR datasets captured basic product characteristics such as each product's
manufacturer, recall date, primary/secondary agents, etc., but lacked their prices. I immediately suggested using the raw Nielsen spreadsheets to find a price, reasoning that at least some of these products had to be recorded there. Unfortunately, the raw Nielsen data lists truncated product names, which makes locating an exact product very difficult. We later decided it made the most sense to find a comparable Nielsen match for each RES/RFR product, as a close substitute would return similar results. I added another tab to our initial spreadsheet to line up the commodities and their reference categories, with an additional column labeled ‘Nielsen Match’ for the truncated Nielsen products. In case I had to return to one of the hundred thousand Nielsen spreadsheets, I added a few more columns for UPC, department, and category, and later the key columns: Total Sales Revenue (2013), Total Equivalent Units, Price per 16 Ounces, Serving Size, Grams per (Price Per 16 Ounces), and the weekly totals for revenues and quantities. Searching through the Nielsen departments was very tedious, but by filtering on several key variables such as ‘Flavor’, I could find matches relatively quickly. I set up a SUM function to tally annual revenue and units sold for each good; that way, all I had to do was copy and paste from Nielsen, and Excel would keep a running total for the match as well as for the entire recall spreadsheet on both variables. I used the same methodology for the ‘Price per 16 Ounces’ column, the equivalized base Nielsen uses to compare goods that would otherwise be incomparable (e.g. fajita mix vs. assorted cookies).
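The spreadsheet logic here reduces to a SUM over the weekly figures followed by a division. As a rough Python sketch (the weekly numbers below are invented for illustration, not Nielsen data):

```python
# Sketch of the spreadsheet arithmetic with invented weekly figures:
# SUM the weekly revenues and equivalent units, then divide to get
# the equivalized 'Price per 16 Ounces'.
weekly_revenue = [1200.50, 980.25, 1105.00]  # hypothetical weekly sales ($)
weekly_units = [400, 350, 375]               # hypothetical 16 oz. equivalent units

total_revenue = sum(weekly_revenue)          # 'Total Sales Revenue' running total
total_units = sum(weekly_units)              # 'Total Equivalent Units' running total
price_per_16_oz = total_revenue / total_units
# price_per_16_oz is about 2.92 dollars per 16 oz. equivalent
```

The equivalized 16-ounce base is what makes the final division meaningful across otherwise incomparable goods.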
Using a simple division formula applied down the column, I divided ‘Total Sales Revenue (2013)’ by ‘Total Equivalent Units’ to arrive at the ‘Price per 16 Ounces’, giving us a rough estimate of what each good would cost at current market value, approximated by the sales volume referenced through Nielsen. Next, I compiled the serving sizes for each good using the FDA's Serving Size Final Rule, converting the respective total-volume serving sizes to grams. The last component I added to my associate's report was the reference amount customarily consumed, an externally determined consumption reference enforced through the same Final Rule. With everything in place, I could crunch the totals and determine one of the report's key metrics: Mean Retail Sales Dollars per Serving. I have provided a sample row with slightly altered figures to depict my findings for the recall project:

  Total Equivalent Units (16 oz.): 4,956,992,908
  Retail Sales Dollars ($): $9,162,671,802.18
  Equivalized Base (oz.): 16
  Total Number (#) of Retail Ounces (oz.): 79,311,886,528
  Total Number (#) of Retail Grams (g): 2,248,452,327,126
  Total Goods in Servings (servings): 44,969,046,543
  Mean Retail Sales Dollars ($) per Serving: $0.20

The ‘Total Equivalent Units’ are the total units of this particular good sold in FY2013; Retail Sales Dollars ($) is the total revenue accrued from annual sales of the good; Equivalized Base (oz.) is the 16-ounce reference metric used to control for variation in food measurements; Total Number (#) of Retail Ounces (oz.) is the Total Equivalent Units times the Equivalized Base (oz.); Total Number (#) of Retail Grams (g) is the Total Number (#) of Retail Ounces (oz.) multiplied by 28.3495, the ounces-to-grams conversion rate; Total Goods in Servings is the Total Number (#) of Retail Grams (g) divided by the reference amount customarily consumed (in this case 50); and finally, the Mean Retail Sales Dollars ($) per Serving is the Retail Sales Dollars ($) divided by the Total Goods in Servings. In this example, the cost, defined as the lost revenue incurred by the manufacturer, can be approximated as $0.20 per serving. These numbers are again altered for my internship review and do not depict statistically significant findings or represent approved/finalized FDA figures. The example was sourced from a similar evaluation that measured an entire category of food (e.g. bread), so it depicts a much larger result, representative of the countless firms that produce bread. My project, by contrast, analyzed a select group of recalled products spanning multiple food categories, so the figures I found were appropriately much smaller per serving than the one above. This is one of many statistics my associate will use in his final public health risk estimation once he compiles the remainder and runs it through @RISK. The project was very data intensive, giving me another opportunity to strengthen my Excel skills and gain more experience running basic Stata syntax to clean my datasets. While working on my recall and pharmaceutical projects, I also began teaching myself SAS programming through one of the FDA's many software subscriptions. I usually worked through these training modules in my spare time and during the occasional break in the action at work to strengthen my statistical toolkit and gain a better understanding of computer programming. The syntax and interface are a little different from Stata and SPSS, but certainly nothing overwhelming. I was very fortunate to have access to a high-demand program, not taught at UMD, that could help me familiarize myself with new analytical methods to supplement my theoretical coursework.
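The chain of conversions behind the sample row can be checked in a few lines of Python. The inputs are the (altered) sample figures from the text, with the 16-ounce equivalized base and the reference amount customarily consumed of 50 grams stated above:

```python
# Reproducing the sample row's arithmetic (using the altered figures from the text).
OZ_TO_G = 28.3495        # ounces-to-grams conversion rate
RACC_G = 50              # reference amount customarily consumed, in grams
EQ_BASE_OZ = 16          # Nielsen's equivalized base

total_equiv_units = 4_956_992_908        # Total Equivalent Units (16 oz.)
retail_sales_dollars = 9_162_671_802.18  # Retail Sales Dollars ($)

retail_oz = total_equiv_units * EQ_BASE_OZ   # 79,311,886,528 retail ounces
retail_g = retail_oz * OZ_TO_G               # ~2.248 trillion retail grams
servings = retail_g / RACC_G                 # ~44.97 billion servings
mean_dollars_per_serving = retail_sales_dollars / servings  # ≈ $0.20 per serving
```

Each intermediate value matches the corresponding column of the sample row, and the final division recovers the $0.20 Mean Retail Sales Dollars per Serving.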
I am eager to continue teaching myself SAS over the next few years to improve my software versatility. Overall, I am happy to reflect on this experience as a great microeconomic opportunity to gain specialized insight into health economics and government regulation. This time there was no other intern to collaborate with, so my ability to communicate across a relatively large and dispersed department was certainly tested. Despite those conditions, everyone on staff was very helpful and accommodating during my tenure, allowing me to get as much out of these short few months as I could. While I will miss my fellow economists at the FDA dearly, I am excited to return to Washington, D.C. in my final semester, where I will conclude my undergraduate internship experience at the executive agency level, working alongside macroeconomists in the U.S. Department of the Treasury's Office of International Affairs.