Best practices for Ansible

A collection of high-level and low-level best practices for large Ansible projects.

  1. 1. Best practices for Ansible
  2. 2. Introduction What is Ansible? ● A configuration management system ● Agentless design: the ‘controller’ (the admin’s localhost) supervises everything ● No mandatory data server to work with ● Uses ssh as the primary transport, but there are many other transports too
  3. 3. An example nginx: ● Install ● Configure reverse-proxy for an application
  4. 4. Name of things
  5. 5. Name of things ● Task + task + task => tasklist ● Tasks + vars + defaults => role ● Tasklist + hosts => play ● Play + play + … = playbook ● Playbooks + inventories = ansible repo (unofficial)
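A minimal sketch of how these names map onto a file. The group names `webservers` and `databases`, the role name `nginx` and the package are placeholders, not taken from the slides.

```yaml
# site.yaml: a playbook is a list of plays.
- name: Configure web servers        # a play: hosts + roles/tasks
  hosts: webservers                  # hypothetical inventory group
  become: true
  roles:
    - nginx                          # a role: tasks + vars + defaults (roles/nginx/)
  tasks:                             # a tasklist: task + task + ...
    - name: Install curl             # a task: one module call
      apt:
        name: curl
        state: present

- name: Check databases              # a second play in the same playbook
  hosts: databases                   # hypothetical inventory group
  tasks:
    - name: Check connectivity
      ping:
```

Run against different inventories (for example `-i staging` or `-i production`), the same playbook serves every environment.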
  6. 6. modules ● Each module configures one specific thing on the host ● Examples: ○ template ○ apt ○ systemd ○ stat ○ postgresql_user ○ object_storage ○ cron ○ crm_resource ○ … ○ ~2200 modules in Ansible 2.4
  7. 7. variables & templates Ansible uses variables to pass arguments to modules. - Each variable is processed by the jinja2 template engine - Tasks can register variables, and there is a set_fact module - Each task, play and role may have its own locally scoped variables - Nested definitions are OK - Recursion is prohibited - Variables are expanded at the moment of use (in modules and conditions) - Dedicated templates for configs are processed the same way as variables
  8. 8. handlers ● Are called if the notifying task reported ‘changed’ ● Are called once per play ● Can be flushed (called) earlier with meta: flush_handlers ● Have play visibility ● Roles can notify each other’s handlers: ○ It’s complicated. Try to avoid this. ● Can listen to another handler’s notification ● Are called in order of declaration, not in order of notification ● Error handling/retry policy: at most once ○ This is bad
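A small sketch of the notify / listen / flush mechanics listed above. The file paths follow the usual role layout; the `reload-monitoring` script is hypothetical.

```yaml
# roles/foo/tasks/main.yaml
- name: Configure foo
  template:
    src: foo.conf.j2
    dest: /etc/foo.conf
  notify: restart foo            # queued only if this task reports 'changed'

- name: Run pending handlers now instead of at the end of the play
  meta: flush_handlers
---
# roles/foo/handlers/main.yaml (handlers run in this order, not in notification order)
- name: restart foo
  service:
    name: foo
    state: restarted

- name: reload monitoring
  command: /usr/local/bin/reload-monitoring   # hypothetical script
  listen: restart foo            # also runs whenever 'restart foo' is notified
```

If the play dies between the template task and the flush, the restart is lost on the next run because the template no longer reports ‘changed’. That is the ‘at most once’ problem.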
  9. 9. handlers and includes
      Handler defined in    | include_role: inner action / outer action / inner + outer action | import_role: inner action / outer action / inner + outer action
      INNER + OUTER handler | outer only / outer only / outer only                             | inner only / inner only / inner only
      INNER handler only    | inner / NOT FOUND / NOT FOUND                                    | inner / inner / inner
      OUTER handler only    | outer / outer / outer                                            | outer / outer / outer
      https://github.com/amarao/ansible_import_include_and_handlers
  10. 10. Conditionals ● Evaluated at the moment of execution ● Evaluated on every iteration for loops ● Evaluated separately for each entry in a ‘block’ ● Have a special hack for ‘is defined’
  11. 11. Loops - All of them are slow and clumsy. - Ansible 2.5: with_items → loop. - Complicated branching is bad. - Complexity is bad. loop_control: loop_var: user label: '{{user.short_name}} at {{user_department}}'
  12. 12. idempotency ● Each task either fails, changes something, succeeds with no change, or is skipped ● A task should report ‘changed’ only if it actually changed something ● A second run of the same task should yield ‘no change’ Important for: - Testing - Stability and audit - Handler calls
  13. 13. “Ansible is not a programming language.” (an Ansible developer)
  14. 14. What does ‘big’ mean for an Ansible project? Kubespray ● 911 files ● 49132 lines Openstack-ansible ● 1196 files ● 52504 lines Openshift-ansible ● 1668 files ● 175745 lines ● Estimated YAML multiplier for line count: ~x3
  15. 15. Not-a-code consequences ● Global variables everywhere ● foo: ‘{{foo + 1}}’ is officially broken. Forever. ● A practical call stack depth: 3-5 ● It’s hard to change values in dictionaries and lists ● Data queries are crazy and complicated (json_query filter in Jinja2):
  16. 16. Sources of pain ● Dependencies ● Slow execution over ssh ● Memory hogging on includes (partially fixed in 2.4.3 and 2.5) ● Data queries ● Rudimentary modularity ● Name conflicts ● Untyped interfaces between roles ● Horrible error reporting for jinja2 templates/filters ● Unpredictable visibility of global variables ● Variable precedence is complicated and is broken in include_role.
  17. 17. Ansible is a muscle, not a skeleton ● Everything is permitted ● Most errors are detected at runtime ○ Or the run even succeeds silently with incorrect behavior ● No universally accepted style guide (* try ansible-lint) ● No well-known design patterns ● Best practices are at elementary school level Why do we still use Ansible? Because it’s the best we have so far.
  18. 18. Some bones to build a skeleton 1. Execution flow: tasks and roles are assigned to hosts 2. Hosts are first-class objects to work with 3. Groups and group inheritance to keep relations between hosts 4. Group variables 5. A simple iteration over lists 6. Transparent access to hosts ‘by ansible magic’ ... I wish this list were longer...
  19. 19. Best practices (High level)
  20. 20. No overengineering It’s not Java or Python. Every act of overengineering bites you badly. ● A play is better than a role ● A role is better than a play repeated in two different playbooks ● A tasklist in a role is better than a second role ● If you can join two roles through a play, use the play ○ If you can’t, use a wrapper role ● A play for a host is better than delegate_to in a task ● delegate_to is better than poking into the hostvars of another host ● Every time you iterate over hosts in a group, God kills a cat
  21. 21. Project layout: partitioning ● Common basics: users, basic packages (vim/iptables), hostname, ssh keys ● Project-specific simple configuration (standard software && simple configs) ● Non-trivial configuration for standard software: e.g. databases, pacemaker ● Non-standard software (custom apps, git deploy, venv, etc) ● Ad-hoc scripts, cron jobs, etc ● Monitoring ● Bootstrap code (run-once tasks, initialization, etc) ● Upgrade procedure(s) ● Recovery procedures
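To illustrate ‘a play for a host is better than delegate_to’, here is a hedged sketch. The host `db1` and the groups `app` and `databases` are made-up names; `postgresql_db` is only an example module.

```yaml
# Harder to follow: a task in the app play reaches over to the database host.
# Without run_once it would also be repeated for every host in 'app'.
- hosts: app
  tasks:
    - name: Create database for foo
      postgresql_db:
        name: foo
      delegate_to: db1                 # hypothetical host name
      run_once: true

# Simpler: a dedicated play for the hosts that actually own the database.
- hosts: databases                     # hypothetical group
  tasks:
    - name: Create database for foo
      postgresql_db:
        name: foo
```

The second form keeps connection settings, become rules and facts on the host being changed, so nothing has to be looked up through hostvars.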
  22. 22. Project layout Included in site.yaml ● Users and basic software ● Software installation and configuration ● Database creation ● Monitoring Used separately: ● Bootstrap ● Update procedure ● Recovery procedure ● Helper scripts for staging ○ Copy data from production ○ Tests for recovered system ○ Creation/teardown for staging ● Inventory update/generation
  23. 23. Scope reduction Each piece of code should work within its own domain: If we configure application foo we shouldn’t touch random bits outside of foo: ❌ NO ● add nginx configuration for foo ● use this magic query to find database IP ● transform list of users from global userlist to foo format ✅ YES ● Use wrapper role to configure nginx (include_role, import_role) ● Use role to search database IP ● Pass userlist explicitly from playbook or another wrapper role
  24. 24. Explicit dependencies There is no sane way to describe dependencies. - Old style (dependencies in meta) does not work and is being deprecated. - New style include_role/import_role ignores meta dependencies. The only way to create a dependency is to do it manually: - import_role with ‘when: role_foo_called is not defined’ - set_fact: role_foo_called inside the role Or just call it twice if it’s fast.
  25. 25. Name it! Name it right! Examples: ● Everything should have a hyperonym (a common name for several things) ○ E.g. ‘configuration playbooks’ VS ‘script playbooks’ ○ Configuration playbooks should be linted to perfection ○ Script playbooks may have unconditional ‘command/shell’ with ‘changed always’ status ● Different types of groups ○ E.g. ‘Execution groups’ VS ‘groups for variables’ ○ Groups for variables should never have assigned tasks (e.g. hosts: database_settings) ● Name your components! ○ E.g. ‘bgp-push’ VS ‘bgp-pull’, ‘agents’, ‘central’, ‘external_access’, etc. “Naming things” is the second hard problem in computing
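A sketch of the manual dependency guard described on this slide. The role name `foo` and the flag `role_foo_called` come from the slide; the file placement is an assumption.

```yaml
# In any role or play that depends on role foo:
- name: Run role foo unless it has already run on this host
  import_role:
    name: foo
  when: role_foo_called is not defined   # applied to every task inside foo
---
# roles/foo/tasks/main.yaml, last task:
- name: Mark role foo as called
  set_fact:
    role_foo_called: true
```

Because import_role copies the condition onto each task of the role, a second import skips all tasks of foo once the fact is set.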
  26. 26. Best practices (low-level details)
  27. 27. Simple tricks ● ansible -i staging --list-hosts all ● ansible-playbook -i staging site.yaml --list-tags ○ Tags should have meaning! ● ansible-playbook -i staging site.yaml --check --diff
  28. 28. Ansible-lint !!!!!!!!!!111 one one one ● Points to subtle errors in playbook code ● Best practices (handlers vs “when: foo|changed” filter) ● Clarity. If lint understands it, people understand it. ● Forces more semantics onto shell/command How much time does it take? ● ~30 lint warnings per hour. ● I cleared my project within 4 hours. There were 3 real-life bugs and 10 minor improvements, all found by ansible-lint
  29. 29. Shell and command modules ● The main source of chaos if used carelessly ● Rules: ○ If they gather information: changed_when: False ○ If they are idempotent: find a way to report changes. ○ If they are not idempotent: use them only after a query: ■ when: ‘foo’ in previous_query.stdout ■ when: previous_query.rc == 2 ● You can refactor if those modules are idempotent ● You cannot refactor if those modules are not idempotent
  30. 30. shell drama And if I can’t detect changes or failure? You are doing it wrong. Find a way.
  31. 31. shell example The ‘ip link set up’ command always returns 0 and never gives output. ❌ NO - name: Link up shell: | ip link set up dev {{dev}} ✅ YES - name: Check link status command: ip link show {{dev}} register: link_status changed_when: False - name: Link up command: ip link set up dev {{dev}} when: "'UP' not in link_status.stdout"
  32. 32. shell example #2 foobar does not report failures at all. We want to execute ‘foobar add’ and we can run ‘foobar list’. ❌ NO - name: Add to foobar shell: | foobar add {{obj}} ✅ YES - name: Check foobar status command: foobar list register: old_foobar_output changed_when: False - name: Add to foobar shell: | foobar add {{obj}} && foobar list register: new_foobar when: obj not in old_foobar_output.stdout failed_when: obj not in new_foobar.stdout
  33. 33. Apt: update_cache Theoretical question: does updating the cache count as a change or not? For practical reasons the answer is: no change. Option 1: integrate into install - name: Install foo become: yes apt: name: foo state: '{{foo_install_state}}' update_cache: '{{apt_update_cache}}' cache_valid_time: '{{apt_cache_valid_time}}' Option 2: use without changes - name: Update apt cache become: yes apt: update_cache: yes cache_valid_time: '{{cache_time}}' changed_when: False
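The three rules above, sketched with the hypothetical `foobar` tool used elsewhere in these slides. The `ensure` subcommand and its ‘created’ output are assumptions made up for this illustration.

```yaml
# 1. Information gathering: never 'changed'.
- name: List foobar objects
  command: foobar list                 # hypothetical tool
  register: foobar_list
  changed_when: false

# 2. Idempotent command: derive 'changed' from its output.
- name: Ensure object exists
  command: foobar ensure {{ obj }}     # hypothetical subcommand, assumed to print 'created' on change
  register: foobar_ensure
  changed_when: "'created' in foobar_ensure.stdout"

# 3. Non-idempotent command: run it only when the query says it is needed.
- name: Add object
  command: foobar add {{ obj }}
  when: obj not in foobar_list.stdout_lines
```

With these guards a second run reports ‘ok’ or ‘skipped’ for all three tasks, which is what makes handlers and refactoring trustworthy.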
  34. 34. Best practices (workflow)
  35. 35. Staging MUST HAVE STAGING AT ANY COST Staging: ● Finds your bugs before production ● Helps to refactor ● Forces you to think of modularity
  36. 36. Development environment Primary staging: ● Virtual machines or real servers. Imitate production as closely as possible Development environment(s): ● Almost like staging, but faster and with omissions ● LXC (or docker) on localhost speeds up runs by ~30-50% ● Deploy containers with Ansible, drop them with Ansible ● Automate rebuild
  37. 37. CI/CD ● Delegate all Ansible tasks to CI/CD server (Jenkins?) ● One job for production, one for staging ● Software updates and other workflow tasks - separate jobs ● Production should be updated only through CI/CD server ○ Keep logs ○ Keep last deployed commit* in those logs ● *Do you use git for your playbooks? You should. ● Run production ‘full ansible run’ often. ○ Make it safe. Second full run = zero changes. Mandatory to have. ● Run staging ‘full ansible run’ before production for all changes. ○ It guards production and saves your face.
  38. 38. New and reinstalled servers Bootstrap.yaml: ● Forget old ssh keys ● Remember new ones ● Install python and ssh keys, create users ● Install all upgrades, restart the server
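A partial sketch of such a bootstrap playbook, assuming Debian/Ubuntu targets. The user name `admin` and the key file path are placeholders; ssh known_hosts handling and the reboot are left out.

```yaml
# bootstrap.yaml
- name: Bootstrap new or reinstalled servers
  hosts: all
  gather_facts: false            # fact gathering needs Python, which may be missing
  become: true
  tasks:
    - name: Install Python with the raw module (works without Python on the target)
      raw: test -e /usr/bin/python3 || (apt-get update && apt-get install -y python3)
      register: python_install
      changed_when: python_install.stdout != ''

    - name: Create admin user
      user:
        name: admin                # hypothetical user name
        shell: /bin/bash

    - name: Install ssh key for the admin user
      authorized_key:
        user: admin
        key: "{{ lookup('file', 'files/admin.pub') }}"   # hypothetical key file

    - name: Install all upgrades
      apt:
        upgrade: dist
        update_cache: true
```

Keeping this outside site.yaml matches the project layout above: bootstrap is a run-once script playbook, not part of the regular full run.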
  39. 39. Per role tests + Ansible way to test roles + Easier to debug - Time consuming - No inter-role integration - Often meaningless without a context
  40. 40. Variables & environments
  41. 41. Places to hide a variable ● Inventory (host, group_name:vars) ● inventory/host_vars ● inventory/group_vars ● host_vars ● group_vars [all.yaml, group_name.yaml] ● roles/defaults ● roles/vars ● ‘vars:’ in any task or role ● register in any task ● include_vars ● defaults/vars of imported role Ansible variables without supervision
  42. 42. Rules to keep sanity ● host_vars are banned anywhere except an inventory ● Roles/vars should be avoided ● Roles should avoid exposing variables to other roles in the same play(book) ○ Reduce global state, OK? ○ If they do, this is called an ‘interface’. Document it. ■ Example: search-for-database-ip can set a variable db_ip. ● Environment-specific variables are kept in the inventory ● Project-specific variables are kept in group_vars ● Roles should use defaults for rarely changed variables ● Use a local ‘vars:’ statement for task-local calculations
  43. 43. Variables and environments Environments: ● production/ ● staging/ ● lab1/ Variables: ● user_list -> group_vars/all.yaml ● domain_prefix -> inventory/group_vars/all.yaml ● foo_listen_port -> group_vars/foo.yaml ● db_password -> inventory/group_vars/dbaccess.yaml ● retry_timeout -> roles/foo/defaults/main.yaml Rule of thumb You must be able to add another environment by creating a new inventory (file/directory) with no changes outside that inventory.
  44. 44. How long to think before adding a variable
      Where                                                                | Time to think | Docs
      roles/foo/tasks/*.yaml (vars section for task)                       | 5 seconds     | no docs
      roles/foo/defaults/main.yaml                                         | 30 seconds    | role docs
      roles/foo/tasks/*.yaml (register)                                    | 1 minute      | no docs
      roles/foo/tasks/*.yaml (set_fact, role-internal)                     | 1 minute      | no docs
      group_vars                                                           | 10 minutes    | role or project docs
      Inventory                                                            | 30 minutes    | role or project docs
      roles/foo/tasks/*.yaml (set_fact, external use outside of the role)  | 60+ minutes   | role and project docs. Mandatory!
      For use in a command line (ansible-playbook -e)                      | 60+ minutes   | role and project docs. Mandatory!
  45. 45. Assertions and validations fail module with ‘when’ (from ceph-ansible): - name: validating variables fail: msg: "please choose scenario" when: - osd_group_name is defined - osd_group_name in group_names - not containerized_deployment - osd_scenario == 'dummy' assert module: - name: Check ansible version run_once: True assert: that: "ansible_version.full|version_compare('2.4','>=')" msg: "You must update Ansible to at least 2.4" delegate_to: localhost tags: - always
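A sketch of how the variables from this slide could be spread over files, assuming each environment is an inventory directory. All values are invented for illustration.

```yaml
# group_vars/all.yaml (project-wide, shared by every environment)
user_list:
  - alice
  - bob
---
# production/group_vars/all.yaml (environment-specific, lives inside the inventory)
domain_prefix: prod
---
# staging/group_vars/all.yaml (same variable, another environment)
domain_prefix: staging
---
# roles/foo/defaults/main.yaml (rarely changed role default)
retry_timeout: 30
```

Adding `lab1/` then means copying an inventory directory and editing only the files inside it, which is the rule of thumb stated above.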
  46. 46. Tags
  47. 47. Tags proliferation - name: Configure foo template: src=foo.conf.j2 dest=/etc/foo.conf notify: restart foo tags: - foo
  48. 48. Tags proliferation - name: Configure foo template: src=foo.conf.j2 dest=/etc/foo.conf notify: restart foo tags: - foo - configure
  49. 49. Tags proliferation - name: Configure foo template: src=foo.conf.j2 dest=/etc/foo.conf notify: restart foo tags: - foo - configure - restart
  50. 50. Tags proliferation - name: Configure foo become: yes template: src=foo.conf.j2 dest=/etc/foo.conf notify: restart foo tags: - foo - configure - restart - become
  51. 51. Tags proliferation - name: Configure foo become: yes template: src=foo.conf.j2 dest=/etc/foo.conf notify: restart foo tags: - foo - configure - restart - become - ip
  52. 52. Tags proliferation - name: Configure foo become: yes template: src=foo.conf.j2 dest=/etc/foo.conf notify: restart foo tags: - foo - configure - restart - become - ip - dont_do_like_this
  53. 53. Concise tags Including tags: ● One tag - one scenario ● --tags your_tag should either: ○ Finish successfully for a new installation ○ Finish successfully for an existing installation ● If one tag covers several plays in a playbook, maybe it’s better to split them into a separate playbook and use import_playbook. Excluding tags: ● Should be used with --skip-tags ● For long or complicated operations only. ● Each ‘always’ tag should have an additional tag for skipping: - debug: var=foo tags: - always - debug_foo
  54. 54. tag examples - apt (all operations with apt, in all roles) - registrations (all operations with registration in a project API, in all roles) - foo_upgrade (all apt operations to install components of foo project) - git (all operations related to git pull/clone) - ip (all operations related to adding/removing IP addresses on server) - discovery ( all ‘search-for-*-ip’ roles) - services (tasks to configure shinken services, ~80 of them, shinken only) - drop (specific for copy-database.yaml, tasks to drop database)
  55. 55. --limit
  56. 56. To limit or not to limit? Line in a template: allow_ip = {% for h in groups.all %} {{(hostvars[h]).ansible_default_ipv4.address}} {% endfor %} ansible-playbook -i inventory test.yaml ✅ ansible-playbook -i inventory test.yaml --limit host1 ❌ fatal: [host2]: FAILED! => {"changed": false, "msg": "dict has no element ansible_default_ipv4"}
  57. 57. Solutions We need information about all hosts, but we have used --limit 1. Forbid to use limits in project 😟 2. Write a partial content 😓 3. Lineinfile on per-host basis 😦 4. Gather facts for all hosts forcefully 😥 5. Use fact cache 😕 6. Use external database 😖 7. Skip task if not a full run 🤔
  58. 58. Partial content {% for h in groups.all %} {% if (hostvars[h]).ansible_default_ipv4 is defined %} {{(hostvars[h]).ansible_default_ipv4.address}} {% endif %} {% endfor %} Good: none Bad: - incomplete config - ‘changed’ on every run with a different --limit ❌
  59. 59. Lineinfile - name: Add host to config lineinfile: path=/etc/foo.conf line="host {{(hostvars[item]).ansible_default_ipv4.address}}" when: (hostvars[item]).ansible_default_ipv4 is defined with_items: '{{groups.all}}' Good: survives --limit with no changes or broken config Bad: old values are not removed Note: Can be used only if the config uses one IP per line
  60. 60. Forceful fact gathering - setup: subset=network delegate_to: '{{item}}' delegate_facts: yes with_items: '{{groups.all}}' when: (hostvars[item]).ansible_default_ipv4 is not defined tags: - always - gather_facts Good: - no random ‘changed’ - Always full config - removes old values - fast (see ‘when’ part) Bad: - fails if any host is down or is not provisioned yet
  61. 61. Fact cache ● Do as in forceful fact gathering ● Set fact caching in ansible.cfg ● Hope it will be there Good: - Works most of the time Bad: - ‘always’ minus ‘most of the time’ = bugs sometimes
  62. 62. External database ● Register each host in etcd/consul ● Query data on each run Good: Works with --limit Bad: External service dependency (down/provision) Removal of the old entities is a problem
  63. 63. Skip if not full run - name: Configure foo template: src=foo.conf.j2 dest=/etc/foo.conf when: full_run vars: full_run: '{{play_hosts == groups.all}}' Good: - Works perfectly with --limit - Won’t fail if some host is down and --limit was used - Fast - Updates and removes old data as needed on each full run Bad: - Does not update config if --limit ✅
  64. 64. templates
  65. 65. Template & task relationship ● Keep templates as simple as possible ● Use ‘vars:’ section for explicit variable declaration ● Never use global variables in a template. Exceptions: ○ Iterations over all hosts ○ Ansible built-in variables ○ A special global variable documented in a project and in a role ○ Very complicated queries. Use comments in the task to list used variables inside the template.
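A sketch of a task that declares everything its template uses. `foo_listen_port` and `db_ip` appear earlier in these slides; the `foo_conf_*` names are invented here.

```yaml
- name: Configure foo
  template:
    src: foo.conf.j2
    dest: /etc/foo.conf
  vars:
    # The template reads only these three names, nothing global.
    foo_conf_listen_ip: "{{ ansible_default_ipv4.address }}"
    foo_conf_listen_port: "{{ foo_listen_port }}"
    foo_conf_db_ip: "{{ db_ip }}"
```

The task then documents the template's interface: to see what foo.conf depends on, read the ‘vars:’ block instead of grepping the template.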
  66. 66. Simplify If a template is small, use ‘copy’ with the ‘content’ argument to inline it - copy: dest: /etc/foobar.conf content: | source_ip = {{ansible_default_ipv4.address}}
  67. 67. Debugging templates: variables - debug: var={{item}} with_items: - myvar1 - myvar2 - ansible_default_ipv4 - all_other_variables_in_template
  68. 68. Debugging templates: Jinja2 Explicit templatization in a separate playbook (e.g. temp.yaml) - template: src=roles/somerole/templates/foo.conf.j2 dest=/tmp/foo.conf delegate_to: localhost connection: local vars: some_var: ... another_var: ...
  69. 69. Templates everywhere You don’t need to use ‘template’ to use jinja2. Every variable is a {{template}}. - copy - lineinfile - blockinfile - all file names for all copy/stat/file modules - arguments to shell and command modules - all other modules (apt, postgresql_user, etc)
  70. 70. External Jinja2 - name: Ugly example foo: argument: '{{(hostvars[var1]).cust_facts[3]|json_query("[?name="+ .. - name: Better example foo: argument={{foo_argument}} vars: foo_argument: "{{lookup('template', 'foo_arguments.j2')}}"
  71. 71. Roles
  72. 72. Roles: structure 1. Use defaults for rarely changed values. Do not use hard-coded constants. 2. Split the role into parts 3. Allow role parts to be called independently 4. Allow parts of the role to be reused 5. Use call caching Nginx: install + configure site roles/nginx/tasks/main.yaml: - import_tasks: install.yaml - import_tasks: configure_site.yaml Calling one part from elsewhere: - import_role: name: nginx tasks_from: configure_site.yaml vars: nginx_site: ... Call caching: - name: install nginx apt: name=nginx state=present when: nginx_installed is not defined register: nginx_installed
  73. 73. Files in roles: vendor in role Good: - Easy to do: copy: src=myfile dest=/var/lib/foo/myfile - Single authority - Versions Bad: - Golden artifacts are kept in the ansible repo
  74. 74. Files in roles: external source Good: - A tidy git. Bad: - Need external storage. - Version control. Examples private apt repo || private git repo || swift container (bad!)
  75. 75. Wrapper role We have an application server foo which should reside behind nginx. ● Foo wants a database IP and an address and port to listen on ● Nginx needs a port to proxy_pass to, a domain, and ssl settings Role foo configures foo only. Role nginx configures any nginx site and needs a bunch of additional variables. The wrapper role glues them together, but does not change anything in foo or nginx.
  76. 76. Wrapper role - name: Configure foo for {{foo_source_ip}} include_role: name=foo tasks_from=configure_foo vars: local_api_ip: '{{foo_local_ip}}' local_api_port: '{{foo_local_port}}' - name: Configure nginx for {{foo_source_ip}} include_role: name=nginx tasks_from=configure_site vars: nginx_sites: - name: 'rttgod_{{foo_source_ip}}' listen_address: '{{foo_source_ip}}' port: '{{foo_external_api_port}}' locations: proxy_pass: 'http://{{foo_local_ip}}:{{foo_local_port}}'
  77. 77. Include_role VS import_role import_role: - Behaves as if the role’s tasks were written in place of the import. - Can override handlers - Defaults are respected (the imported role uses its own defaults but does not change the parent’s defaults) - Does not support loops - Supports conditions: - A condition is applied to each task in the imported role.
  78. 78. Include_role VS import_role include_role: - Supports loops - Absolute mess - Broken in each new ansible release in a new way (hello, 2.5): - Delegation - Handlers - Defaults vs set_fact - Parent’s variable access - include_tasks is much more reasonable, but requires more files and lines.
  79. 79. Proper looping with an include in a role - name: Loop over something include_tasks: per_something.yaml with_items: '{{something}}' - name: in per_something.yaml import_role: name=foo vars: var1: '{{item}}' - name: A task in role ‘foo’ foo: arg=var1 delegate_to: Works in ansible 2.5!
  80. 80. handlers
  81. 81. handlers ● Avoid cross-role handlers (except for wrapper roles) ● Use meta: flush_handlers
  82. 82. At least once persistent handlers role/tasks/main.yaml: - name: setup foo apt: name=foo state=present notify: foo installed - … other tasks here… - meta: flush_handlers - name: check if restart is needed stat: path={{foo_flag}} register: foo_restart_flag - block: - name: Restart foo service: name=foo state=restarted - name: cleanup restart flag file: path={{foo_flag}} state=absent when: foo_restart_flag.stat.exists handlers/main.yaml: - name: foo installed file: path: '{{foo_flag}}' state: touch role/vars/main.yaml: foo_flag: /var/run/foo-inst.flag
  83. 83. Plugins
  84. 84. Plugin types module ≠ plugin - lookup_plugins/ - Load data from external sources - Perform calculations and queries - Iterate - action_plugins/ - Do stuff on hosts - vars_plugins - inventory_plugins All plugins are written in Python, and can be stored in ‘*_plugins/’ directory near a playbook, or within a role.
  85. 85. Lookup plugins 1. Try to do it with ansible. 2. Try to do it with an in-line jinja2 template 3. Try to do it with an in-line json_query 4. Try to do it with an external jinja2 template 5. If not, write a plugin Rule of thumb: if the jinja2 template is more than ⅓ of the size of the plugin (and its tests), write a plugin. If less, use jinja2. Python in ansible complicates reading! A lot. A plugin without tests is worse than jinja2 of any complexity.
  86. 86. Lookup plugins: an example
        from __future__ import (absolute_import, division, print_function)
        __metaclass__ = type

        from ansible.plugins.lookup import LookupBase


        class LookupModule(LookupBase):
            def run(self, terms, **kwargs):
                # Input comes either as the lookup terms or as keyword arguments.
                data = terms or kwargs
                assigned_others = data['assigned_others']
                somethings = data['somethings']
                foo_source_ips = []
                for something in somethings:
                    # 'entry' avoids shadowing the outer 'data' variable.
                    for entry in something.get('datas', []):
                        if entry['other'] in assigned_others:
                            foo_source_ips.append(entry['foo_source_ip'])
                return foo_source_ips
  87. 87. Lookup plugins: an example - name: Register IP uri: method: PUT url: '{{url}}' body_format: json body: '{"something": "{{item["something"]}}", "other": "{{item["other"]["data"]}}"}' status_code: - 200 - 201 - 304 register: reg_status changed_when: reg_status.status in [200, 201] with_my_custom_filter: '{{something}}'
  88. 88. Lookup plugins: json_query equivalent - name: looping over include_tasks: process_other.yaml with_items: '{{selected_datas}}' loop_control: loop_var: data label: '{{other}} @ {{data.foo_source_ip|default("no ip")}}' when: data.foo_source_ip is defined and data.other in assigned_others vars: somethings: '{{global_config["somethings"]}}' query: "[?name=='{{assigned_something}}'].datas" selected_datas: '{{global_config.somethings|json_query(query)}}' foo_source_ip: '{{data.foo_source_ip}}' something: '{{assigned_something}}' other: '{{data.other}}'
  89. 89. Other plugins I have no experience with them, sorry. Key ideas for action plugins, when to write them: - Too many too complicated command/shells in a playbook/role - Needed reusability - Better test coverage - Complicated data types in use
  90. 90. Refactoring
  91. 91. Refactoring Adding features Cleaning up the mess
  92. 92. Refactoring when adding features ● Use small steps ● Write a plan for refactoring before changing anything ● Paper drawing is advised. ● Use the ‘not changed’ status to check that the refactoring does not change anything ● Use ansible-playbook --check --diff ● Refactor in two steps: ○ Change internals without changes in the result ○ Make small, simple changes which change the result ● Do not forget to add cleanup code if needed ○ Drop it later ● Each step should have a separate commit with a multi-line description ○ You can do this, I believe in you!
  93. 93. Refactoring when cleaning up mess - Find scenarios for execution - Eliminate false ‘changed’ - Reduce spread between files (no hostvars!) - Split plays into playbooks - Split tasklist into roles - Replace hardcoded values with variables - In templates too! - Do you remember about staging? - Reduce complexity of queries and iterations - Replace ‘shell/command’ with modules - Ansible-lint
  94. 94. Refactoring example: Scraps from my table ● Write all ideas, even discarded. ● Write all variables and file names you’ve introduced or changed ● Draw arrows between objects
  95. 95. THE END Final advice: ● Every role and every playbook cuts corners. ● Cut as few corners as possible. ● Each ‘cut corner’ has consequences. ● The amount of time dedicated to a role or to a playbook is a function of its importance. Be safe, be reasonable, and let ansible-lint be with you.