This document presents a planning and acting system for the Angry Birds video game that uses refinement methods. It designs a Refinement Acting Engine (RAE) based on the ARP-interleave framework with a Sequential Refinement Planning Engine (SeRPE). The system incorporates greedy, depth-first search, and A* algorithms to search for plans. Experimental results show that the ARP-interleave approach outperforms alternatives that rely solely on search algorithms.
Planning and Acting System for Angry Birds with Refinement Methods
1. Deliberately Planning and Acting for Angry Birds with
Refinement Methods
Ruofei Du, Zebao Gao, Zheng Xu
CMSC 722 Technical Report advised by Prof. Nau & Dr. Shivashankar
Computer Science Dept., Univ. of Maryland, College Park
21. Related Work
Planning & Machine Learning
Most work focuses on
modeling the physical properties of the game or
utilize machines learning as empirical knowledge base.
None of the related work
compares the methods
using or not using refinements methods
in an ARP-interleave framework.
22. Contribution
Sample log of our system
0 [System] Program starts with configuration -d
32 [System] Server waiting on port: 9000
1051 [System] Client connected
1934 [System] Running A* agent without IRPE
8309 [Vision] Detecting objects in the Angry Birds using vision
module
10068 [Vision] Detect TNTs 0
10069 [Vision] Detect pigs and rolling 1, glass 5, stone 1, wood
27, birds 8
10069 [Vision] Statistics: 1, 34, 0
10213 [Planner] m1-destroyAllPigs: 1 pigs
10216 [Planner] m2-destroyPig
ab.vision.ABObject[x=536,y=290,width=14,height=10]
10216 [Planner] Estimate launch point from 2
10217 [Planner] launch point java.awt.Point[x=9,y=961], angle
73.79194556203768
17742 [Actor] Shooting starts 193, 328, -184, 633
27765 [Actor] Shooting completes
28096 [Vision] Scale factor = 1.0006048459391437
28267 [Vision] Detecting objects in the Angry Birds using vision
module
28943 [Vision] Detect TNTs 0
28944 [Vision] Detect pigs and rolling 0, glass 2, stone 1, wood
10, birds2
28944 [Vision] Statistics: 0, 13, 0
29116 [Planner] m1-destroyAllPigs: 1 pigs
29116 [Planner] m2-destroyPig
ab.vision.ABObject[x=536,y=290,width=14,height=10]
29117 [Planner] Estimate launch point from 2
29117 [Planner] launch point java.awt.Point[x=0,y=959], angle
72.99302990476194
36778 [Actor] no sling detected, can not execute the shot, will
re-segement the image
37049 [Vision] Detecting objects in the Angry Birds using vision
module
38198 [Vision] Detect TNTs 0
45. Actions failures
A shot may not accomplish the goal of destroying a
targeted pig
Challenge
World is seldom predictable
(Ghallab, Nau and Traverso, 2015)
46. Unexpected side effects of
actions
A shot may changing the future planning state by
destroying other pigs
Challenge
World is seldom predictable
(Ghallab, Nau and Traverso, 2015)
47. Exogenous events
A shot may put the stage into an unsolvable state like
surrounding a pig with irons
Challenge
World is seldom predictable
(Ghallab, Nau and Traverso, 2015)
51. It models the problem more
accurately
Advantages
ARP-interleave Framework
52. Plans generated by this system
are more practical
the actual status of the world are always taken into
consideration
Advantages
ARP-interleave Framework
53. Better decision
can be made about choosing the next plan..
Advantages
ARP-interleave Framework
67. Inaccurate Simulation
It is a challenging task to derive an exact model of a physical world.
The knowledge base of the physical world is built upon the prior
exploration with trials and errors. Even in a virtual world like Angry
Birds, it is almost impossible to model the world without reading its
source code.
Discussion
Why deterministic model fails
(Ghallab, Nau and Traverso, 2015)
68. Changes in Time
In practical scenarios like Angry Birds, the global states of any
object may be subject to change from time to time. For example,
it is very hard to know whether the scene is static or not after a
shot. The structure may be unstable and moving in a very slow
pace. As a result, the subtle changes in the global states may
lead to failure of the current plan.
Discussion
Why deterministic model fails
(Ghallab, Nau and Traverso, 2015)
69. Erroneous Actions
As for human beings, it is hard to perform an action such as
textit{move left hand 5 centimeters forward exactly}. Similarly,
for a video game like Angry Birds that involves human
interactions, the acting engine hardly points to the perfect
position in pixel-level. Consequently, the actor may produce
different states than the planner predicted.
Discussion
Why deterministic model fails
(Ghallab, Nau and Traverso, 2015)
70. 1. present a deliberately planning and acting system
with refinement methods for the popular Angry
Birds video game.
2. design a RAE based on ARP-interleave with
SeRPE. Greedy, DFFS and A* are incorporated to
search for planning
3. our experimental results show that ARP-interleave
outperforms their counterparts which purely rely
on the search algorithms
Conclusion
Summary
71. Deliberately Planning and Acting for Angry Birds with
Refinement Methods
Ruofei Du, Zebao Gao, Zheng Xu
CMSC 722 Technical Report advised by Prof. Nau & Dr. Shivashankar
Computer Science Dept., Univ. of Maryland, College Park
Thank
you!
Good morning, everyone. My name is Ruofei. This is Zebao, this is Zheng. Our topic is “Deliberately Planning and Acting for Angry Birds with Refinement Methods”
First of all, we’ll play our demo video for a quick overview of our project.
https://www.youtube.com/watch?v=u7XJ0g6d9po
Angry Birds is a popular video game by ROVIO in iOS, Android, PC, Mac and even on the web.
It has attracted over 200 million players all over the world since 2009.
Image credit: http://cdn2.hubspot.net/hub/202647/file-362198412-jpg/images/user_generated_content_is_the_future_of_ecommerce_2.jpg
It has attracted over 200 million players all over the world since 2009.
The task for the player is to shoot birds with different properties from a slingshot at a structure that houses pigs and to destroy all the pigs.
The task for the player is to shoot birds with different properties from a slingshot at a structure that houses pigs and to destroy all the pigs.
The task for the player is to shoot birds with different properties from a slingshot at a structure that houses pigs and to destroy all the pigs.
The task for the player is to shoot birds with different properties from a slingshot at a structure that houses pigs and to destroy all the pigs.
The structure can be very complicated and can involve a number of different object categories with different properties, such as wood, ice and stone, as shown in Fig. 1. After each shot, the entire game environment is subject to tremendous change.
The structure can be very complicated and can involve a number of different object categories with different properties, such as wood, ice and stone, as shown in Fig. 1. After each shot, the entire game environment is subject to tremendous change.
After each shot, the entire game environment is subject to tremendous change.
One motivation of our work is to assist new players to play the game. The other motivation is from IJCAI Angry Birds competition, which is due in July.
One motivation of our work is to assist new players to play the game. The other motivation is from IJCAI Angry Birds competition, which is due in July.
Past research on Angry Birds has largely focus on AI Planning or Machine Learning, as well as geometrical representation and physical modeling.
Past research on Angry Birds has largely focus on AI Planning or Machine Learning, as well as geometrical representation and physical modeling.
Past research on Angry Birds has largely focus on AI Planning or Machine Learning, as well as geometrical representation and physical modeling.
Past research on Angry Birds has largely focus on AI Planning or Machine Learning, as well as geometrical representation and physical modeling.
Past research on Angry Birds has largely focus on AI Planning or Machine Learning, as well as geometrical representation and physical modeling.
Past research on Angry Birds has largely focus on AI Planning or Machine Learning, as well as geometrical representation and physical modeling.
However,
Here we summarizes the contribution of our work as follows:
First
Second
Third
Eventually
Here is framework figure by our team. First, there is a chrome extension to capture screenshot from the game. Next, the game server decoded the screenshot and transfers to the automatic agent, the client side. Finally, the client uses a vision module provided by the IJCAI community and uses our planner to calculate the interactions for the game. They also provides an actor module
Zheng: Here is our problem formalism of the Angry Birds puzzles.
The following state variables are kept up-to-date by the vision component of angry birds
Eventually
Eventually
Eventually
Eventually
Eventually
Eventually
Eventually
Eventually
Eventually
Eventually
Eventually
Eventually
Eventually
The following state variables are kept up-to-date by the vision component of angry birds
Thanks Zheng for talking about problem formalism. So, One question we may have at this point is: What is difficult with Angry birds? What are the challenges? One of the biggest challenges is that the world is seldom predictable.
Even a well-trained player cannot tell the exact states of all objects after a single shot.
To be specific. First thing, a shot may not accomplish the goal of destroying a targeted pig.
Also. Actions often have unexpected side effects. A shot may changing the future planning state by destroying other pigs.
And Even worse. A shot may put the stage into an unsolvable state like surrounding a pig with irons.
So, How are we dealing with this? We are applying a planning and acting approach, ARP-interleave, which will replan for next step after each action.
Before each step, it will generate a plan based on the SeRPE refinement tree.
After each step, the system will check the current status of the world.
It will not only check the outcomes of the previous action.
Also, it will update the status of all the objects (like pigs, stones, woods, etc.) according to their latest actual state.
And If the current step fails to achieve the goal, the planner will try to use the remaining methods that are feasible in current state.
ARP-interleave well fit the purpose of solving the Angry Birds problem. Here are its advantages.
First, It models the problem more accurately
Second, Plans generated by this system are more practical
because the actual status of the world is always taken into consideration.
Finally, as a result, better decisions can be made about choosing the next plan.
Now let’s look at the search algorithms we are using to generate plans in the simulation process of SeRPE.
We adopt three forward search algorithms in our implementation: greedy search, DFFS and A*.
We’ll call these algorithms SePRE-Greedy,SePRE-DFFS and SePRE-A* respectively.
SeRPE-Greedy uses two heuristics:
The first heuristics always choose a pig as target.
The second heuristics will choose the pig that is closest to the sling.
In our planning and acting framework, simulation algorithm is called after each action.
And if we are using the second heuristic, for example, then each time SeRPE-Greedy is called, it will return a plan that hits the pigs one by one in the order of distance.
DFFS. One advantage of depth first search comparing with greedy search is that it guarantees to find a solution if one exists.
SeRPE-DFFS uses one heuristic: always choose an existing pig as a target. In our implementation, we maintain a Hashset to keep the visited states.
And finally, SeRPE-A*. A* surpasses greedy search and DFFS when the state space is not too large, as A* guarantees to return the optimal solution.
SeRPE-A* uses a heuristic based on scores of objects shown in the table, and it will estimate the cost by total scores of objects hit by the bird.
We conducted a set of experiments on a workstation in the GVIL using a Nvidia Quadro K6000
and an Intel Xeon Processor, 32GB RAM
Using the Chrome version of Angry Birds
Under 10 minutes for time limit
We use the first 9 levels of Angry Birds to evaluate different algorithms
We get the statistics of the highest score for each level
As well as average time cost to pass each level
The results are shown in these two figures. The SeRPE-A* algorithm outperforms the naive system without using SeRPE.
First
Second
Third
Third
Finally we would like to thank Prof. Nau and Vikas for teaching AI planning and advising the team. Thank you all for listening.
After each shot, the entire game environment is subject to tremendous change.