This repository was archived by the owner on May 18, 2022. It is now read-only.

[RFC] EmbeddedAnsible with ansible-runner-based implementation #45

@Fryguy

Description

Architecture

General approach

The current AWX implementation works by creating a provider that talks to an AWX instance and uses the provider refresh to pull data into the database. CRUD operations on AWX objects go through the provider API: the object is created in AWX and then brought in via EMS refresh. After that, callers use the ManageIQ models to do whatever they need with the data.

As such, all of the ManageIQ callers use the provider API as an abstraction layer, and we can take advantage of that. Instead of having provider CRUD operations go to a provider, we can write the data directly into the database tables as if a "refresh" had occurred immediately.
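To make the contrast concrete, here is a minimal sketch of the idea. The `InlineCrud` module and `CONFIGURATION_SCRIPT_SOURCES` array are hypothetical stand-ins (the real code writes ActiveRecord rows): instead of POSTing to AWX and waiting for a refresh to pull the object back, the "create" call writes the row directly, producing the same record a refresh would have saved.

```ruby
# Illustrative only: this simulates writing directly to the table rather
# than round-tripping through an external AWX instance plus EMS refresh.
module InlineCrud
  # Stand-in for the configuration_script_sources table.
  CONFIGURATION_SCRIPT_SOURCES = []

  # Write the attributes straight into the "table" and return the record,
  # exactly as a refresh-created record would appear to callers.
  def self.create_repository(attrs)
    record = attrs.merge(:id => CONFIGURATION_SCRIPT_SOURCES.size + 1)
    CONFIGURATION_SCRIPT_SOURCES << record
    record
  end
end

repo = InlineCrud.create_repository(:name => "my-playbooks", :scm_url => "https://example.com/repo.git")
```

Because callers already go through the model layer, they cannot tell whether the row came from a refresh or from a direct write.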

Repositories

A repository is created as a ManageIQ::Providers::EmbeddedAnsible::AutomationManager::ConfigurationScriptSource (< ConfigurationScriptSource). For the implementation in this PR, the git repos are cloned into Rails.root.join("tmp/git_repos/:id"). This works great for a single appliance, but will not work as well for federated appliances, nor for appliances that can't access the internet directly. As such, a different design is needed, which is described below in the git repo management section.

Once the repository is cloned, the playbooks are each synced as a ManageIQ::Providers::EmbeddedAnsible::AutomationManager::Playbook (< ConfigurationScriptPayload < ConfigurationScriptBase, table name configuration_scripts). In this PR I've also pulled in the "name" attribute as the playbook description, though I'm not sure if that is correct.

Service Template

When designing a service, the service template is saved as a ManageIQ::Providers::EmbeddedAnsible::AutomationManager::ConfigurationScript which is a subclass of ConfigurationScript, which is a subclass of ConfigurationScriptBase (table name configuration_scripts).

CONFUSION NOTE: Both service templates and playbooks are stored in the same table, but with different subclasses and different column usage. Additionally confusing is that, unlike playbooks, which get a subclass named with the native term, the class here is ConfigurationScript instead of the native term JobTemplate, yet some of the relationships use the term job_template.
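A hedged sketch of the class layout just described may help. The class bodies below are empty stand-ins, not the real ManageIQ models: both the playbook class and the service template class ultimately inherit from ConfigurationScriptBase, so rows for both land in the single configuration_scripts table (distinguished in Rails by the STI type column).

```ruby
# Simplified stand-ins for the real models; only the ancestry matters here.
class ConfigurationScriptBase
  def self.table_name
    "configuration_scripts" # shared by every subclass (Rails STI)
  end
end

class ConfigurationScriptPayload < ConfigurationScriptBase; end
class Playbook < ConfigurationScriptPayload; end   # one row per synced playbook

class ConfigurationScript < ConfigurationScriptBase; end  # the service template class
                                                          # (native AWX term: JobTemplate)
```

Both leaf classes report the same table, which is exactly the source of the confusion the note calls out.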

For the purposes of this PoC, I've stored some of the options for the service template in the variables column, but I don't believe that is the correct way to do it. We will have to go back to the original design to see where the Tower provider stores those values during refresh.

Service execute

When an ansible service template is ordered, a ServiceTemplateProvisionRequest (< MiqRequest) is started, which goes through automate, and ultimately an instance of a ServiceAnsiblePlaybook (< Service) is executed. In the general Service flow there are two main methods that need to be implemented: execute and check_completed. In the execute method, a ManageIQ::Providers::EmbeddedAnsible::AutomationManager::Job (< OrchestrationStack) is created as a resource for this service and "launched", moving on to the check_completed step.
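The two-method contract can be sketched as follows. The class and method names follow the text above, but the internals are simplified stand-ins (the fake job "finishes" immediately), not the real ManageIQ implementations.

```ruby
# Stand-in for the EmbeddedAnsible AutomationManager::Job resource.
class FakeJob
  def launch!
    @launched = true
  end

  def finished?
    !!@launched # a real job would still be running here
  end
end

# Sketch of the Service contract: execute creates and launches the job
# resource; check_completed is polled until the job is done.
class ServiceAnsiblePlaybookSketch
  def execute
    @job = FakeJob.new
    @job.launch!
  end

  def check_completed
    @job.finished? ? [true, nil] : [false, "job still running"]
  end
end
```

The state machine driving the request calls execute once, then calls check_completed repeatedly until it returns true.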

Launching ansible-runner

For launching ansible-runner, we are using the ManageIQ::Providers::AnsibleRunnerWorkflow class, which will eventually use the Ansible::Runner helper class. (Note: this workflow class was created as a helper for provider authors to create ansible-based operations; however, the code itself is not provider-specific and should be moved out of the providers namespace and into the Ansible::Runner namespace instead.)

CONFUSION NOTE: The workflow class is a subclass of ::Job, which is our generic state machine using MiqTasks. This is completely unrelated to ManageIQ::Providers::EmbeddedAnsible::AutomationManager::Job, which is just a resource representation for the service.

The AnsibleRunnerWorkflow, being a self-contained Job, will launch ansible-runner with JSON output, asynchronously poll whether the ansible-runner execution has completed, and, once it detects completion, grab the results, store them in the MiqTask context, and clean up the ansible-runner execution temp directory.
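The launch/poll/collect/cleanup steps can be sketched like this, assuming a runner that drops JSON event files into a temp directory. `FakeRunner` is hypothetical and completes immediately; the real Ansible::Runner interface is not reproduced here.

```ruby
require "json"
require "tmpdir"

# Hypothetical runner stand-in: writes one JSON "event" file, then reports done.
class FakeRunner
  def initialize(dir)
    @dir = dir
    File.write(File.join(dir, "event1.json"), { "stdout" => "PLAY [all] ..." }.to_json)
  end

  def completed?
    true # a real runner would still be executing when first polled
  end
end

Dir.mktmpdir do |dir|
  runner = FakeRunner.new(dir)
  sleep 0.01 until runner.completed?  # asynchronous polling step
  # Collect the JSON results once complete...
  $task_context = Dir.glob(File.join(dir, "*.json")).map { |f| JSON.parse(File.read(f)) }
end # ...and mktmpdir removes the temp directory on block exit (the cleanup step)
```

Here `$task_context` stands in for the MiqTask context the workflow writes its results into.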

Service check_completed

In the meantime, the check_completed step of the ServiceAnsiblePlaybook runs periodically. In this implementation, the MiqTask associated with the AnsibleRunnerWorkflow is watched for completion. Once it has been marked as finished, the service can move on with its post-execution steps.

Services page

The services page shows the details of the ServiceAnsiblePlaybook, and the user can drill into the provision details. One of those details is the ansible stdout. In the AWX-based implementation, this was one of the few places where the database records were not used; instead, an asynchronous call would be made to AWX directly to fetch the stdout on demand. In the new ansible-runner design we don't have that option. For now, this implementation happens to have the information already stored in the AnsibleRunnerWorkflow's associated MiqTask, and since we have a relationship between the ServiceAnsiblePlaybook and the MiqTask, we can get the data directly from the database. We may not want to store this information in the MiqTask permanently, so a better design may be needed, which I elaborate on in the Ansible stdout section.

The stdout is extracted from the stored JSON records; however, it has ANSI escape codes for terminal colors embedded. In the previous implementation, one could ask AWX for the HTML version, but we don't have that in this implementation. Instead, we use the terminal ruby gem, which converts raw terminal output to HTML, replacing ANSI escape sequences with CSS classes. For this PoC, I've used the default CSS file that comes with the terminal gem, which styles the HTML by wrapping it in a div and scoping the style to that wrapper div. We will likely want the UI team to have the freedom to style this directly, so we can forgo the built-in CSS in favor of styles in our ManageIQ stylesheets.
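The conversion the terminal gem performs looks roughly like the following simplified stand-in. This is not the gem's implementation (the real `Terminal.render` handles far more escape sequences); it only illustrates the escape-sequence-to-CSS-class idea for two colors.

```ruby
# Hypothetical, minimal ANSI-to-HTML converter: map a couple of SGR color
# codes to CSS classes and close the span on reset (code 0).
ANSI_CLASSES = { "31" => "term-red", "32" => "term-green" }

def ansi_to_html(text)
  text.gsub(/\e\[(\d+)m/) do
    code = Regexp.last_match(1)
    code == "0" ? "</span>" : %(<span class="#{ANSI_CLASSES.fetch(code, 'term')}">)
  end
end

puts ansi_to_html("\e[32mok: [localhost]\e[0m")
```

The resulting spans carry only class names, which is what lets the stylesheet (the gem's default, or our own) decide the actual colors.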

Installing ansible-runner

  • On Mac

```sh
brew install ansible python
pip3 install ansible-runner
source /usr/local/Cellar/ansible/2.7.10/libexec/bin/activate && pip3 install psutil && deactivate
```

  • On Fedora/CentOS

```sh
sudo wget -O /etc/yum.repos.d/ansible-runner.repo https://releases.ansible.com/ansible-runner/ansible-runner.el7.repo
sudo dnf install ansible-runner
```

git repo management

@mkanoor and I had started on a federated git repo management design back when we had the idea that the automate models would work better stored in git repos, allowing us to run them as of any point in time, as well as giving us history tracking, auditing, and reverting capabilities.

The premise was that an appliance would be given the git_owner role, which would behave much like the db_owner role. This appliance would have internet access and thus could clone from public locations like GitHub and/or private git instances. A record would be put into the git_repositories table, so that if we needed to fail over the appliance we could re-clone.

All other appliances, if they needed to access something about the git repository, would git clone/fetch from the appliance with the git_owner role. This would allow non-internet connected appliances to get at the data in an on-demand fashion.

Some of these classes already exist, such as the GitRepository, GitReference, GitBranch, and GitTag models, as well as the GitWorktree class which manages the on-disk repositories using the rugged gem.

The work that still needs to occur is to

  • complete these classes
  • expose the git protocol from the appliance, likely through Apache, but with some sort of server to server authentication (perhaps similar to how we do MiqServer.api_system_auth_token_for_region?)
  • have a way to identify the appliance with the git_owner role, likely in a similar fashion to MiqRegion#remote_ui_miq_server

Once these are completed, we can ensure a git repo is present by checking whether our on-disk clone exists: if not, we git clone from the git_owner appliance; if it exists but is not up to date (determined by comparing against the expected SHA stored in the git_repositories table), we git fetch from the git_owner appliance.
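The ensure logic above reduces to a small decision, sketched below. The method name, arguments, and lambdas are illustrative; a real implementation would shell out to git (or use rugged) against the git_owner appliance.

```ruby
# Hypothetical helper: decide whether to clone, fetch, or do nothing,
# based on whether the on-disk repo exists and matches the expected SHA
# recorded in the git_repositories table.
def ensure_repo(on_disk_sha, expected_sha, exists:, clone:, fetch:)
  if !exists
    clone.call                    # first time: clone from the git_owner appliance
  elsif on_disk_sha != expected_sha
    fetch.call                    # stale: fetch to reach the expected SHA
  else
    :up_to_date                   # nothing to do
  end
end
```

"Update on Launch" then becomes: fetch first, update the expected SHA in the table, and run this same check before launching.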

Additionally, this would allow us to support things like "Update on Launch": we would know the expected SHA for launching and can ensure we use that SHA, so when doing an Update on Launch we git fetch first and then update the expected SHA.

Extra bonus: once all of this is done, @mkanoor and I will be able to realize our git-based automate design 😄

Seeding

I'm not sure we need to seed any more than what's in the PR (i.e., default credentials for "localhost"). The original code had to create defaults for a number of things in order to please AWX, but those aren't necessarily needed for the new implementation. Even so, we need to research each one of them. (cc @carbonin)

Ansible stdout

In this implementation, ansible stdout is stored in the MiqTask and its associated AnsibleRunnerWorkflow job. (cc @agrare) These stdouts can get really big, so it's probably best to store each one only once. We probably also do not want to store it in the MiqTask, as that record could eventually be cleaned up, so it's probably better to hang a binary_blob entry off of the ServiceAnsiblePlaybook instance.
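The store-once idea can be sketched as moving the data out of the transient task and onto the service. `BlobStore` and the hash-based "models" below are hypothetical stand-ins for BinaryBlob and MiqTask, not the real classes.

```ruby
# Hypothetical stand-in for BinaryBlob storage keyed by owner.
class BlobStore
  def initialize
    @blobs = {}
  end

  def store(owner_id, data)
    @blobs[owner_id] = data
  end

  def fetch(owner_id)
    @blobs[owner_id]
  end
end

task  = { :context_data => "PLAY RECAP ..." }       # MiqTask stand-in
blobs = BlobStore.new
# On workflow completion: extract the stdout from the task (removing it
# there, so it is stored exactly once) and hang it off the service.
blobs.store(:service_1, task.delete(:context_data))
```

After this, cleaning up the MiqTask no longer loses the stdout, since the service owns the only copy.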

Another complication here is how the UI is implemented, since this was originally a special casing for asynchronously fetching the stdout from AWX on-demand. In the original implementation, the backend code would start a special MiqTask specifically to get the output as HTML, and temporarily store it in the task. Then, the UI would wait_for_task, and when it was done delete the MiqTask.

None of this is needed anymore, and I think the backend code could be changed such that when the AnsibleRunnerWorkflow completes, the data is extracted from the MiqTask and stored as a binary_blob. Later, when the UI asks for the output, no MiqTask is needed: the data is already in the database and can be served directly. Even better, this can probably be done as a normal controller action, where the controller asks the model for the raw output and the TerminalToHtml call is done in the controller (since that's the more logical place to convert raw data to presentation HTML).

Automate methods that are playbooks directly (without the service/service catalog)

Automate methods that are playbooks directly can use the AnsiblePlaybookWorkflow directly. Unlike the Service modeling, which has its own execute and check_completed callouts, the automate methods do not.


TODO

Credential management

TODO

This section will likely need UI work.

Some settings in the service, such as logging, verbosity

TODO

Using the embedded_ansible or perhaps automate role

TODO

Upgrades

TODO

Tests

TODO
