Best Practices for Ansible
Ansible is an important open source automation tool and platform. It is used for configuration management, application deployment, task automation, and orchestration of complex workflows.
Ansible figures prominently in DevOps. It allows Information Technology (IT) administrators and developers to automate repetitive tasks and streamline the management and deployment of infrastructure, applications, and services. Ansible’s business and strategic features include:
- Agentless Architecture: Does not require installation of agents.
- Idempotency: Running the same playbook repeatedly produces the same result, with changes made only when a system differs from its target state.
- Portability: Operates consistently across different operating systems and cloud environments.
Data centers effectively require Ansible, or one of its competitors. Businesses operating at the data center scale have requirements for reliability, economy, scalability, and flexibility. These are exactly the advantages of Ansible. It makes data center operations more cost-effective, predictable, resilient, and responsive.
Ansible Fundamentals
The following is a list of key terms that cover the fundamental components and concepts associated with Ansible:
Target State: Ansible is a declarative language. It details target states for computing systems and how those states are achieved. It then takes responsibility for achievement of the target states. This creates a kind of teamwork between users and Ansible, where users take the lead in telling what they want, and Ansible works out the details of how it’s done. This is different from older styles of system administration and system administration tools.
An important aspect of target state is how widely it applies. Many practitioners have strong experience with Ansible’s use in provisioning and deployment, but don’t realize it also applies to other automations. While it is good at “spinning up” a new server or updating an existing one, it’s also handy for many more uses that aid overall system health. Examples include daily checks of certificate expirations and hourly confirmations that file systems have at least 10% free storage. It only takes a few lines of Ansible to implement these and many other target states and verifications.
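As a minimal sketch of the free-storage check, the following play asserts that every mounted file system reported by fact gathering keeps at least 10% of its capacity available. The play name and message text are illustrative assumptions, and scheduling the hourly run is left to whatever scheduler your environment already uses.

```yaml
---
- name: Confirm that file systems have at least 10% free storage
  hosts: all
  gather_facts: true
  tasks:
    - name: Assert free space on each mounted file system
      ansible.builtin.assert:
        that:
          - (item.size_available / item.size_total) >= 0.10
        fail_msg: "{{ item.mount }} has less than 10% free storage"
        quiet: true
      loop: "{{ ansible_facts['mounts'] }}"
      loop_control:
        label: "{{ item.mount }}"
      # Skip pseudo file systems that report no capacity
      when: item.size_total | default(0) > 0
```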
Playbooks: Ansible playbooks are written in YAML and define one or more “plays”, each of which maps a target system or group of systems to an ordered list of tasks. Playbooks express desired states for systems and how those states are achieved. Ansible then takes responsibility for achieving those states. That dynamic is Ansible’s fundamental accomplishment.
Modules: Ansible modules are the building blocks of playbooks. Modules are discrete units of code that enact specific tasks such as package management, file configuration, or launching services. One of Ansible’s great assets is its enormous collection of built-in modules and the ability for users to author custom ones.
Tasks: Ansible tasks are individual units within a playbook that call modules to perform specific actions. Tasks execute sequentially on target systems to achieve desired states.
Roles: Ansible roles organize and package related playbooks, variables, tasks, and other components into reusable and shareable units. Roles’ modularization of configuration promotes reusability across playbooks and projects.
Inventory: The inventory file defines the hosts and systems under consideration. Inventory can be either static or dynamic. It typically includes such information as hostnames, IP addresses, groups, and variables.
Variables: Ansible allows use of variables that make playbooks more dynamic and flexible. They also feature different scopes, including global, playbook, role, and task.
Facts: Facts are data that Ansible gathers about target systems, by default with the setup module at the start of each play. Examples of gathered information include hardware, operating systems, and internet addresses. Playbooks inform the decisions they make with such facts.
Templates: Ansible templates are files structured in Jinja2 syntax with placeholders. Playbook execution dynamically populates the placeholders with variables. Templates can generate configuration files, scripts, and other Ansible artifacts.
Handlers: Handlers are tasks that run only when another task notifies them, which happens when the notifying task reports a change. By default, notified handlers run once at the end of the play. A common handler responsibility is to restart services after a configuration change.
Ad-hoc Commands: Although Ansible is declarative, it also embeds several imperative mechanisms that streamline and simplify particular operations. Its ad-hoc commands are indispensable for quick system health checks, troubleshooting, and other isolated remedies.
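To tie several of these terms together, here is a short, hedged sketch of a playbook. It contains one play with a variable, tasks that call modules, a template, a fact reference, and a handler. The group name webservers, the variable web_port, and the template file site.conf.j2 are hypothetical names chosen only for illustration.

```yaml
---
- name: Configure web servers              # a play: maps hosts to tasks
  hosts: webservers                        # hypothetical inventory group
  become: true
  vars:
    web_port: 8080                         # hypothetical variable used by the template
  tasks:
    - name: Install nginx                  # a task that calls the package module
      ansible.builtin.package:
        name: nginx
        state: present

    - name: Render the site configuration from a Jinja2 template
      ansible.builtin.template:
        src: site.conf.j2                  # hypothetical template file
        dest: /etc/nginx/conf.d/site.conf
        mode: "0644"
      notify: Restart nginx                # runs the handler only if the file changed

    - name: Report a gathered fact
      ansible.builtin.debug:
        msg: "Running on {{ ansible_facts['distribution'] }}"

  handlers:
    - name: Restart nginx
      ansible.builtin.service:
        name: nginx
        state: restarted
```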
Ansible Best Practices
While best practices certainly improve run-time efficiency, they also improve organizational efficiency. They can promote teamwork, output reliable results with minimum effort, help onboarding, ease maintenance burdens, and even protect from legal liability.
As Abelson and Sussman wrote: “Programs must be written for people to read, and only incidentally for machines to execute.” In much the same way, the best Ansible playbooks are an ongoing asset for their human readers.
Recognize that Ansible playbooks and related specifications are source, or “code”. Like all other sources, they deserve a home in a version control system. Think of this as “best practice zero”, which precedes the following top 12 best practices for using Ansible.
File System Layout
Organize projects with a consistent file system layout. Separate playbooks from roles, in directories respectively named playbooks and roles. The result should look similar to the following example project directory structure:
project/
├── playbooks/
│   └── example_playbook.yml
├── roles/
│   └── example_role/
│       ├── tasks/
│       ├── handlers/
│       ├── templates/
│       └── …
├── ...
├── group_vars/
│   ├── all.yml
│   ├── production.yml
│   └── development.yml
├── ...
├── inventory/
│   ├── production_hosts
│   ├── staging_hosts
│   └── development_hosts
├── ...
├── vault/
│   ├── secret_file.yml
│   └── …
└── ...
Many naming choices in an Ansible project are matters of convention rather than programming, although a few names, such as group_vars and the subdirectories of a role, do carry meaning for Ansible itself. All the more reason to standardize on common practices others have identified and save your team’s attention for deeper matters. After all, IT is a collaborative undertaking.
The point of this best practice is less about the virtue or aesthetics of a directory spelled playbooks rather than playbook, and more about the benefits of a common language for the whole team. Using these best practices, teams can shift their attention from worrying about particular details to thinking more about how to work together toward larger business goals.
Ansible Configuration
Use ansible.cfg for global configuration. Define sensible defaults for the inventory and roles paths. Use comments to document the reasons behind the choices made.
Explicitly set forks to control parallelism. Configure pipelining to limit ssh operations and increase performance. Configure ControlPath to share ssh connections. Adjust timeout and poll_interval to manage timeout of long-running tasks.
Control verbosity of logging based on actual experience and measurements of the specific playbooks in use. Periodically review the configuration to ensure it is consistent with established policies and goals.
Playbook Design and Structure
Use Roles to modularize playbooks. Refer to Ansible Galaxy for inspiration regarding useful definitions of Roles. Define your own choices in requirements.yml.
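A requirements.yml file can be quite small. The sketch below lists one role and one collection from Ansible Galaxy. The specific entries and the version constraint are illustrative assumptions, not recommendations.

```yaml
---
# requirements.yml: external roles and collections this project depends on
roles:
  - name: geerlingguy.nginx          # example community role from Ansible Galaxy

collections:
  - name: community.general
    version: ">=8.0.0"               # illustrative constraint; pin what you have tested
```

The listed content can then be installed with ansible-galaxy install -r requirements.yml.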
Consider segmenting large and complex playbooks into multiple smaller ones, each with a focus on a specific component or functionality. The resulting bundle of playbooks is likely easier to manage than the original complete one. An alternative way to structure complex playbooks is with tags. Tags effectively disable or enable pieces of a playbook. For instance, it’s sometimes beneficial to keep a playbook whole, while controlling distinct pieces within it.
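As a hedged illustration of tags, the play below keeps database and web tasks in one playbook while letting each piece run separately. The package names, tag names, and the file name site.yml used afterward are assumptions made for the example.

```yaml
---
- name: Deploy the application stack
  hosts: all
  become: true
  tasks:
    - name: Install database packages
      ansible.builtin.package:
        name: postgresql
        state: present
      tags: [database]

    - name: Install web server packages
      ansible.builtin.package:
        name: nginx
        state: present
      tags: [web]
```

Running ansible-playbook site.yml --tags web executes only the web task, while --skip-tags database runs everything except the database task.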
Separate inventory, configuration, and variable information into environment-specific files.
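One possible shape for an environment-specific inventory is a YAML file per environment, sketched below for production. The file name, host names, and group names are hypothetical. Note that Ansible's YAML inventory plugin expects a .yml, .yaml, or .json extension by default, which is why the sketch assumes one.

```yaml
---
# inventory/production_hosts.yml (hypothetical file and host names)
all:
  children:
    webservers:
      hosts:
        web1.example.com:
        web2.example.com:
    databases:
      hosts:
        db1.example.com:
```

Selecting an environment is then a matter of passing the matching file, for example ansible-playbook -i inventory/production_hosts.yml playbooks/example_playbook.yml.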
Maintain a Vault for all sensitive or private information. This includes passwords, certificates, tokens, keys, or any other customer details Ansible needs to know. Consider the alternative of storing sensitive data in a file, and referring to it, rather than coding it into Ansible.
Periodically review and refactor the playbook structure to keep it fresh and well-aligned with project requirements.
Variable Names
Choose descriptive variable names. For instance, gateway rather than gw. However, also choose brief names over complicated ones.
Use snake case, for example, database_account rather than DatabaseAccount or other variations.
Document the purpose, usage, and range of variables with comments. Examples of particularly useful comments focused on specific variables include:
# hostname is case-insensitive, so that 'server1' and 'SERVER1' behave identically
# corpus_account must be qualified: 'name@domain.com' is OK, but 'name' is not
Group variables hierarchically. This is likely to result in such names as database_account, database_host, database_password, and database_priority. Environment-specific variables deserve meaningful prefixes such as prod_database_account or env_database_account.
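Put together, the naming and commenting advice might produce a variables file like the following sketch. The file path, values, and the stated range for database_priority are hypothetical and exist only to show the style.

```yaml
---
# group_vars/production.yml (hypothetical values)

# database_host must be a fully qualified domain name
database_host: db1.example.com

# database_account is the least-privileged account the application uses
database_account: app_reader

# database_priority ranges from 0 (lowest) to 10 (highest)
database_priority: 5

# database_password is intentionally absent: it lives in the encrypted vault file
```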
Avoid shadowing reserved keywords. Rather than item or serial, choose server_item or hardware_serial_number.
Don’t be afraid to make exceptions to these rules when appropriate. For example, abbreviate gateway down to gw if a particular team has a well-understood, longstanding practice of doing so in other systems and languages beyond Ansible.
Last but not least, always be consistent across playbooks and projects.
Error Handling in Ansible
Error handling is supremely important. Most of the work accomplished by computing systems is done when things work as intended. However, a good playbook often has more lines devoted to responding to failures than to what happens when everything goes right.
“Handling” encompasses everything from ignoring the error, to logging it, notifying a monitoring system, or launching a diagnostic process. The most important best practice in error-handling is explicit use of ignore_errors and register. When a particular condition is judged to be non-fatal, mark it with ignore_errors, allowing the playbook to continue. Also mark it with an appropriate comment such as “# Error here is non-fatal because ...”. Also handle error conditions conditionally with when: task_result.failed. A different error-handling mechanism, block-rescue-always, is applicable for resource management and cleaning up problematic states. Take advantage of Ansible’s assert module. Don’t assume that a particular service is available before starting to use it. Instead, assert its availability beforehand.
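The following sketch combines these mechanisms in one play: register with ignore_errors, a conditional follow-up, an assert before relying on a service, and a block with rescue and always sections. The paths, the port number, the database_host value, and the migrate script are hypothetical placeholders.

```yaml
---
- name: Demonstrate explicit error handling
  hosts: all
  vars:
    database_host: db1.example.com           # hypothetical host
  tasks:
    - name: Check for an optional cache directory
      ansible.builtin.command: test -d /var/cache/example_app
      register: cache_check
      changed_when: false
      # Error here is non-fatal because the cache is rebuilt on demand
      ignore_errors: true

    - name: Note the missing cache
      ansible.builtin.debug:
        msg: "Cache directory is absent; continuing without it"
      when: cache_check.failed

    - name: Probe the database port
      ansible.builtin.wait_for:
        host: "{{ database_host }}"
        port: 5432
        timeout: 5
      register: database_probe
      # Error here is non-fatal because the assert below reports it clearly
      ignore_errors: true

    - name: Assert that the database is reachable before using it
      ansible.builtin.assert:
        that:
          - database_probe is succeeded
        fail_msg: "Database at {{ database_host }} is not reachable"

    - name: Run a migration with guaranteed cleanup
      block:
        - name: Run the migration
          ansible.builtin.command: /usr/local/bin/migrate   # hypothetical script
      rescue:
        - name: Record the failure
          ansible.builtin.debug:
            msg: "Migration failed: {{ ansible_failed_result.msg | default('no message') }}"
      always:
        - name: Remove the lock file
          ansible.builtin.file:
            path: /tmp/migrate.lock
            state: absent
```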
Learn Ansible’s built-in logging, auditing, debugging, and exit code functionality to handle errors most effectively.
Ansible Logging
Ansible logging has several roles. Learn the essentials by practicing with the debug module. In its simplest form, debug can deliver a message during playbook execution through a specification such as:
- name: Log a single diagnostic
  ansible.builtin.debug:
    msg: "This is the diagnostic logged at this point"
Next, learn how to configure callback_whitelist (renamed callbacks_enabled as of ansible-core 2.11), log_level, and log_path in ansible.cfg. Experiment with log_level to learn how the categories CRITICAL, DEBUG, ERROR, INFO, and WARNING apply in your projects. Set them to meet the needs of specific applications and your own preferences. Some administrators like to log everything that might be useful, but only enable CRITICAL for daily operations. Others only log diagnostics that are guaranteed to demand response. Either approach, or various alternatives between them, works. It is more important to be consistent about which style you choose.
Ansible’s callback plugins naturally apply to many logging situations, for example when you want to customize output formats, escalate notifications to email, profile performance metrics during an incident, or otherwise meet logging requirements.
Timestamp your log entries. Rotate logs to ensure efficient use of storage. Archive logs for auditing and compliance. Treat logs as sensitive information that deserve security controls, so configure access only to users with a need to view them.
With logging basics in place, consider more sophisticated log management through such aggregators as Elasticsearch, Logstash, Kibana, and Splunk. These help scale your ability to analyze logs.
Decide on a review policy. No well-founded best practice applies universally in regard to how and when to review logs. It’s best to be realistic. If decision-makers believe that logs need to be read, allocate time to do so as an explicit policy, and track the results.
Inline Comments
Comments are important and rewarding, although widely under-used in real-world practice. Each time you write a line of Ansible, ask yourself: what would help me understand the intent of this if I return here six months from now? Ideally, your playbooks should be so simple and idiomatic that their source speaks for itself. The next best thing to that ideal situation is source that’s so well-commented that it answers any questions that naturally arise. Always write good comments, and insist that your whole team does, too.
READMEs
Write a README for each directory and subdirectory in a project. It could be something as brief as:
File: README.md
# Variables for the Staging environment

This specification details the Ansible variables which are specific to actions in the staging environment.
Other README files can be several hundred words about the architecture and design decisions that a particular directory represents. Three natural best practices applicable to README files are:
- Create exactly one README.md for each directory in a project.
- Format the contents as well-formed Markdown.
- Provide high-level “philosophy” and motivation in the README. Leave technical details to source comments. Minimize repetition of the source comments, and instead refer to them in README files.
Playbook Documentation
Prepare a top-level README.md which explains the purpose and use of the playbook. Provide the reader with at least one way to test the playbook. In other words, explain how to do something and what the result should be. Include examples, use cases, and references to relevant documents. Provide a link to your policy on the subject. If some aspect of the playbook is hard to explain, it’s even more important to explain it. Use dataflow, state, or entity diagrams, as appropriate.
List Ansible versions that the playbook is compatible with. Include license and copyright notices in the README.md. Review the README.md periodically to make sure it aligns with the current state of the playbook. The result is a playbook that is easier to use and maintain correctly, particularly for those not involved in its original creation.
Use of Vaults for Sensitive Data
Encrypt sensitive data, including variables, configurations, task contents, and whole files. However, do not encrypt information that is not sensitive. Store sensitive variables in vars/secrets.yml and encrypt the file. Reference the encrypted variables in the playbooks as necessary.
Control access to the Vault with such controls as file systems permissions. Only allow authorized users to decrypt sensitive data, and only with a proper decryption key. Choose strong passwords and encryption keys. Write explicit policies for rotation schedules and practice rotation to ensure that it’s correctly executed.
Passwords must not appear in playbooks. Use a credential manager, or at least prompt for necessary passwords to be supplied at runtime.
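A hedged sketch of both approaches follows: variables loaded from an encrypted vars/secrets.yml, plus one password prompted for at runtime. The group name, the variable names database_account, database_password, and admin_password, and the destination path are assumptions; the sketch presumes the first two are defined inside the encrypted file.

```yaml
---
- name: Use secrets without writing them into the playbook
  hosts: databases                       # hypothetical inventory group
  become: true
  vars_files:
    - vars/secrets.yml                   # encrypted beforehand with ansible-vault
  vars_prompt:
    - name: admin_password               # hypothetical variable, supplied at runtime
      prompt: Administrator password
      private: true
  tasks:
    - name: Write the application credentials file
      ansible.builtin.copy:
        content: |
          account={{ database_account }}
          password={{ database_password }}
          admin_password={{ admin_password }}
        dest: /etc/example_app/credentials
        mode: "0600"
      no_log: true                       # keep secret values out of logs and output
```

The file is encrypted once with ansible-vault encrypt vars/secrets.yml, and the playbook is then run with --ask-vault-pass or a vault password file.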
Secure Communication
Ansible projects generally communicate by way of ssh, and ssh best practices include the following:
- Choose strong keys.
- Choose strong ciphers.
- Secure configurations.
- Choose key-based authentication rather than password authentication.
- Distribute keys securely.
- Minimize agent forwarding.
- Define appropriate policies, including a limit on login failures and idle timeouts.
- Consider configuring access-control, restrictions on authentication methods, and lists of allowed users through sshd_config.
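Several of the items above can themselves be enforced with Ansible. The play below is a minimal sketch that sets a few sshd_config options and validates the file before it is put in place. The chosen values, the sshd binary path, and the service name sshd are assumptions; some distributions name the service ssh.

```yaml
---
- name: Apply a baseline sshd policy
  hosts: all
  become: true
  tasks:
    - name: Set sshd options
      ansible.builtin.lineinfile:
        path: /etc/ssh/sshd_config
        regexp: "^#?{{ item.key }}\\s"
        line: "{{ item.key }} {{ item.value }}"
        validate: /usr/sbin/sshd -t -f %s    # reject a configuration that does not parse
      loop:
        - { key: PasswordAuthentication, value: "no" }   # key-based authentication only
        - { key: MaxAuthTries, value: "3" }              # limit on login failures
        - { key: ClientAliveInterval, value: "300" }     # idle timeout probe, in seconds
        - { key: AllowAgentForwarding, value: "no" }     # minimize agent forwarding
      notify: Restart sshd

  handlers:
    - name: Restart sshd
      ansible.builtin.service:
        name: sshd                           # assumed service name
        state: restarted
```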
Where Ansible reaches managed hosts over a transport other than ssh, such as WinRM for Windows hosts, configure that transport to use TLS. Control access to the control node with firewalling and network restrictions. Patch Ansible components regularly and enable two-factor authentication (2FA) for access to the control node.
If your project uses Ansible APIs, configure communication for TLS. If your project uses Ansible Tower or AWX, configure HTTPS.
Privilege Escalation and Sudo
Learn Ansible’s become feature, designed expressly for secure privilege escalation. Apply become precisely, for a single play at a time, rather than for an entire playbook. Use become_user as an additional way to increase the precision of an escalation. Study become_method to understand the applicability of different escalation methods. For instance, while sudo is the default, a PowerBroker-equipped environment needs to favor pbrun. Review escalation uses periodically.
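As an illustrative sketch, the play below escalates only on the tasks that need it, and uses become_user to run one task as an unprivileged application account. The group name, the appuser account, the repository URL, and the destination path are hypothetical.

```yaml
---
- name: Deploy with narrowly scoped privilege escalation
  hosts: appservers                      # hypothetical inventory group
  tasks:
    - name: Install git as root
      ansible.builtin.package:
        name: git
        state: present
      become: true                       # escalates for this task only
      become_method: sudo                # the default, stated explicitly

    - name: Check out the application as its own account
      ansible.builtin.git:
        repo: https://example.com/app.git    # hypothetical repository
        dest: /srv/app
      become: true
      become_user: appuser               # hypothetical unprivileged account
```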
Conclusion
The biggest payoffs in regard to Ansible best practices come from routine, non-technical habits. This includes maintaining playbook version control and accurate documentation, separating secrets from public information, roles from actions, and targets from implementations. Update your Ansible instance through a well-defined software development lifecycle (SDLC), and use tools such as Ansible Lint appropriately. Make sure every line of code exists for a reason.
While these habits are not technically deep, with them in place across your teams, Ansible’s best practices do pay off.