This document describes a study of a real-time guidance camera interface designed to improve the aesthetic quality of photos taken by novice photographers. The system uses computer vision techniques to detect the region of interest in a photo, assesses how well it follows the rule-of-thirds composition guideline, and gives users real-time feedback. In a user study, expert photographers and Mechanical Turk workers rated photos taken with the guidance interface significantly higher in aesthetic quality than photos taken with a static gridline interface, and the guided photos conformed better to the rule of thirds. The interface was effective in improving photo composition and quality for novice users.
Real-time Camera Interface Enhances Photo Aesthetics
1. Real-time Guidance Camera Interface to
Enhance Photo Aesthetic Quality
Yan Xu¹, Joshua Ratcliff¹, James Scovell², Gheric Speiginer³, Ronald Azuma¹
¹Intel Labs, ²Intel Corporation, ³Georgia Institute of Technology
4. Real-time Guidance for Novice Users
A Camera Interface that can
• Understand the scene
• Know the object of interest
• Give concrete guidance
Research question:
Is the real-time guidance interface an effective way to enhance
photos’ aesthetic quality?
5. Choosing One Photography Rule in One Photo
Scenario
• Rule-of-thirds
• 1) important compositional elements should be placed along the lines
that divide the image into thirds, or at their intersections [1]
• 2) the object of interest should occupy roughly one third of
the total image space [2]
• One person portraiture
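The two parts of the rule can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names, the bounding-box format `(left, top, right, bottom)`, and the choice to normalise distance by the image diagonal are all assumptions made for this example.

```python
import math


def thirds_guides(width, height):
    """Return the two vertical lines, two horizontal lines, and the
    four intersection points that split an image into thirds."""
    xs = (width / 3, 2 * width / 3)      # vertical guide lines
    ys = (height / 3, 2 * height / 3)    # horizontal guide lines
    points = [(x, y) for x in xs for y in ys]
    return xs, ys, points


def distance_to_nearest_point(bbox, width, height):
    """Distance from the bounding-box centre to the closest intersection,
    expressed as a fraction of the image diagonal (an assumed normalisation)."""
    left, top, right, bottom = bbox
    cx, cy = (left + right) / 2, (top + bottom) / 2
    _, _, points = thirds_guides(width, height)
    nearest = min(math.hypot(cx - px, cy - py) for px, py in points)
    return nearest / math.hypot(width, height)


def subject_width_ratio(bbox, width):
    """Fraction of the image width covered by the subject (part 2 of the rule)."""
    left, _, right, _ = bbox
    return (right - left) / width


if __name__ == "__main__":
    frame_w, frame_h = 1920, 1080
    subject_box = (480, 200, 800, 520)   # hypothetical detection result
    print(thirds_guides(frame_w, frame_h)[2])
    print(distance_to_nearest_point(subject_box, frame_w, frame_h))
    print(subject_width_ratio(subject_box, frame_w))
```

A guidance interface could use the first value to suggest moving the camera and the second to suggest stepping closer or further away, though how the actual system maps these to feedback is not specified here.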
[1] Peterson, B. F. (2003). Learning to see creatively, Amphoto Press.
[2] Smith, J. T. (1797). Remarks on rural scenery. Nathaniel Smith ancient Print. Cited by the Wikipedia
page about rule of thirds: http://en.wikipedia.org/wiki/Rule_of_thirds (retrieved September 10, 2014)
9. System Components
2. Calculate how closely the region of interest follows the rule of thirds
The bitmap mask for calculating the alignment between
subject-of-interest and rule-of-thirds
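One way such a bitmap mask could work is sketched below. The slides do not describe the mask's contents, so everything here is an assumption: the mask weights peak at 250 on the third lines (chosen only to match the 0-250 scale reported in the results), fall off linearly with distance, and the score is the mean weight under the subject's pixels. The names `thirds_weight_mask` and `alignment_score` are hypothetical.

```python
def thirds_weight_mask(width, height, peak=250.0, falloff=None):
    """Build a per-pixel weight mask that is `peak` on the rule-of-thirds
    lines and decays linearly to 0 at `falloff` pixels away (assumed shape)."""
    if falloff is None:
        falloff = min(width, height) / 6
    lines_x = (width / 3, 2 * width / 3)
    lines_y = (height / 3, 2 * height / 3)
    mask = []
    for y in range(height):
        row = []
        for x in range(width):
            # Distance to the closest guide line, vertical or horizontal.
            d = min(min(abs(x - lx) for lx in lines_x),
                    min(abs(y - ly) for ly in lines_y))
            row.append(peak * max(0.0, 1.0 - d / falloff))
        mask.append(row)
    return mask


def alignment_score(roi, mask):
    """Mean mask weight over the pixels where the binary ROI is set.
    Returns 0.0 when the ROI is empty."""
    total, count = 0.0, 0
    for roi_row, mask_row in zip(roi, mask):
        for inside, weight in zip(roi_row, mask_row):
            if inside:
                total += weight
                count += 1
    return total / count if count else 0.0


if __name__ == "__main__":
    size = 30
    mask = thirds_weight_mask(size, size)
    roi = [[False] * size for _ in range(size)]
    roi[10][10] = True                    # subject sits on an intersection
    print(alignment_score(roi, mask))
```

Averaging over the region rather than summing keeps the score independent of subject size, so alignment and proportion can be reported separately.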
12. Procedure
• 40 users took portrait photos of a friend, using both our interface
and a static grid interface
• 24 professional photographers rated the photos
• 48 Mechanical Turk raters rated the photos
14. Quantitative Results (1)
• Photos taken with the real-time guidance UI have better aesthetic quality than those taken
with the static gridline UI
• A two-factor repeated-measures ANOVA showed that expert photographers and
Mechanical Turk (MT) workers rated photos taken with the real-time guidance interface
significantly higher than those taken with the static gridline interface
(expert: F = 7.62, p < .05, partial η² = .249; MT: F = 20.41, p < .01, partial η² = .303).
Raters                     Real-time Guidance UI      Static Gridline UI
Expert photographers       M = 43.95, SD = 22.99      M = 40.60, SD = 22.08
Mechanical Turk workers    M = 65.98, SD = 21.87      M = 60.92, SD = 22.42
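As a sanity check on the effect sizes reported for this comparison, partial eta squared can be recovered from an F statistic and its degrees of freedom. The slides do not report degrees of freedom, so the values below are assumptions: 1 for the two-level interface factor, and the number of raters minus 1 for the error term.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared from an F statistic:
    eta_p^2 = (F * df_effect) / (F * df_effect + df_error)."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


if __name__ == "__main__":
    # Degrees of freedom are assumed, not taken from the slides.
    expert = partial_eta_squared(7.62, df_effect=1, df_error=24 - 1)
    mturk = partial_eta_squared(20.41, df_effect=1, df_error=48 - 1)
    print(round(expert, 3))   # 0.249
    print(round(mturk, 3))    # 0.303
```

Under these assumed degrees of freedom the formula reproduces the reported values of .249 and .303, which suggests the raters were treated as the repeated-measures units.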
15. Quantitative Results (2)
• Users follow the rule of thirds better when they use the real-time guidance (RG)
interface than when they use the static gridline (SG) interface
• Users align the subject to the rule-of-thirds grid better with the RG interface
than with the SG interface (average difference = 31.76 on a 0-250 scale, p < .05,
one-tailed paired t-test)
• The proportion of the human subject's width is significantly closer to 1/3 when
using the RG interface than when using the SG interface (average difference =
6%, p < .01, one-tailed paired t-test). Users tend to have smaller
subject/image ratios when using the SG interface
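For readers unfamiliar with the test used above, a paired t statistic can be computed as follows. This is a generic sketch using only the Python standard library; the per-user scores in the usage example are made-up placeholders, not data from the study, and the one-tailed p-value would still need to be looked up from a t distribution with the returned degrees of freedom.

```python
import math
import statistics


def paired_t(scores_a, scores_b):
    """Return (mean difference, t statistic, degrees of freedom) for
    paired samples, where each index is the same user under two conditions."""
    if len(scores_a) != len(scores_b) or len(scores_a) < 2:
        raise ValueError("need two equal-length samples with at least 2 pairs")
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)     # sample standard deviation
    if sd_diff == 0:
        raise ValueError("all differences are identical; t is undefined")
    t_stat = mean_diff / (sd_diff / math.sqrt(len(diffs)))
    return mean_diff, t_stat, len(diffs) - 1


if __name__ == "__main__":
    # Placeholder alignment scores on a 0-250 scale (NOT study data).
    rg_scores = [180, 150, 200, 170]
    sg_scores = [140, 155, 160, 150]
    print(paired_t(rg_scores, sg_scores))
```

A paired design fits here because each user took photos with both interfaces, so the test operates on per-user differences instead of comparing two independent groups.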
16. Qualitative Findings - Experts
1. FACE: Genuine/natural smile, emotion, eye contact, glasses reflection, teeth, skin tone
2. BODY: Aliveness, natural poses, hands, sense of movement and action, head/body proportion
3. AROUND THE SUBJECT: Subject saliency, distracting background right next to the subject
4. BACKGROUND: Leading lines (and other prominent lines) and vanishing points,
distraction, whether it complements the subject
5. BIG PICTURE: Composition (balance, camera angle/distance, rule of thirds), lighting
(exposure, evenness, shadows), color (white balance, saturation), camera angle
[Chart: horizontal axis from 0% to 50%]
17. Qualitative Findings – Mechanical Turk Raters
1. FACE: Smile, facial expression (naturalness, confidence, attractiveness), teeth, skin,
hair, mood (more fun, happier, more relaxed), eye contact
2. BODY: Natural poses, more flattering body shape, less distraction or cut-off by other
elements, pose, clothes
3. AROUND THE SUBJECT: Subject saliency, distracting background right next to the
subject, foreground/background harmony
4. BACKGROUND: Less distraction, realistic or not, broader view and context, leading lines
5. BIG PICTURE: Lighting and shadows, color (tone, accuracy, saturation, glare, patches,
vividness, true to reality, harmony), composition, camera angle and distance, sharpness
[Chart: horizontal axis from 0% to 50%]
19. Take-away Message
• The real-time guidance interface is effective at improving photo
aesthetic quality and users' conformance to photography rules
• Understanding photos in both RGB and depth can help us better
evaluate photo quality and provide feedback
20. New Capabilities of Understanding Photos
• Depth
• Segmentation
• Geometry
• Lighting
…
21. Thank you!
contact: yan.xu@intel.com