fallback_ips.yml exits early when there is an unreachable host in the inventory #10993

Open · Rickkwa opened this issue Mar 11, 2024 · 0 comments · May be fixed by #11006
Labels: kind/bug (Categorizes issue or PR as related to a bug.)

Rickkwa (Contributor) commented Mar 11, 2024

What happened?

This is a continuation of #10313.

When roles/kubespray-defaults/tasks/fallback_ips.yml runs against an inventory containing an unreachable host, the entire play exits after the setup task with NO MORE HOSTS LEFT.
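For context, the failing task looks roughly like the sketch below. This is a reconstruction from the module arguments visible in the output further down; the exact loop expression in kubespray may differ. Because the task is run_once and delegates the setup module to every host in a loop, a single unreachable item marks the whole aggregated loop result as unreachable.

# Approximate reconstruction of the task in
# roles/kubespray-defaults/tasks/fallback_ips.yml
# (the loop expression here is an assumption)
- name: Gather ansible_default_ipv4 from all hosts
  setup:
    gather_subset: '!all,network'
    filter: "ansible_default_ipv4"
  delegate_to: "{{ item }}"
  delegate_facts: yes
  loop: "{{ ansible_play_hosts_all }}"
  run_once: yes
  ignore_unreachable: true  # added by PR #10601, see below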

What did you expect to happen?

I expect the entire kubespray-defaults role to finish running, but it exits the play after the single task.

How can we reproduce it (as minimally and precisely as possible)?

Minimal inventory:

[k8s_cluster:children]
kube_control_plane
kube_node
calico_rr

[kube_control_plane]
k8s1.local  # reachable host

[etcd]
k8s1.local  # reachable host

[kube_node]
k8s3.local  # problematic unreachable host
k8s2.local  # reachable host

[calico_rr]

And then this minimal playbook:

- name: Prepare nodes for upgrade
  hosts: k8s_cluster:etcd:calico_rr
  gather_facts: False
  any_errors_fatal: "{{ any_errors_fatal | default(true) }}"
  environment: "{{ proxy_disable_env }}"
  roles:
    - { role: kubespray-defaults }

Execute with ansible-playbook -i hosts.ini bug.yml
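To reproduce the SSH timeout without a dedicated broken node, one option (my own suggestion, not part of the original report) is to point k8s3.local at an unroutable address on the machine running ansible-playbook:

# /etc/hosts on the control node
# 192.0.2.1 is a TEST-NET-1 documentation address, typically unroutable,
# so the SSH connection times out
192.0.2.1 k8s3.local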

OS

Linux 6.5.11-8-pve x86_64
NAME="AlmaLinux"
VERSION="9.3 (Shamrock Pampas Cat)"
ID="almalinux"
ID_LIKE="rhel centos fedora"
VERSION_ID="9.3"
PLATFORM_ID="platform:el9"
PRETTY_NAME="AlmaLinux 9.3 (Shamrock Pampas Cat)"
ANSI_COLOR="0;34"
LOGO="fedora-logo-icon"
CPE_NAME="cpe:/o:almalinux:almalinux:9::baseos"
HOME_URL="https://almalinux.org/"
DOCUMENTATION_URL="https://wiki.almalinux.org/"
BUG_REPORT_URL="https://bugs.almalinux.org/"

ALMALINUX_MANTISBT_PROJECT="AlmaLinux-9"
ALMALINUX_MANTISBT_PROJECT_VERSION="9.3"
REDHAT_SUPPORT_PRODUCT="AlmaLinux"
REDHAT_SUPPORT_PRODUCT_VERSION="9.3"

Version of Ansible

Tried both:

ansible [core 2.15.9]
  config file = /root/kubespray-test/kubespray/ansible.cfg
  configured module search path = ['/root/kubespray-test/kubespray/library']
  ansible python module location = /root/kubespray-test/venv-latest/lib64/python3.9/site-packages/ansible
  ansible collection location = /root/.ansible/collections:/usr/share/ansible/collections
  executable location = /root/kubespray-test/venv-latest/bin/ansible
  python version = 3.9.18 (main, Jan  4 2024, 00:00:00) [GCC 11.4.1 20230605 (Red Hat 11.4.1-2)] (/root/kubespray-test/venv-latest/bin/python)
  jinja version = 3.1.3
  libyaml = True

and

ansible [core 2.14.14]
  config file = /root/kubespray-test/kubespray/ansible.cfg
  configured module search path = ['/root/kubespray-test/kubespray/library']
  ansible python module location = /root/kubespray-test/venv-2.14/lib64/python3.9/site-packages/ansible
  ansible collection location = /root/.ansible/collections:/usr/share/ansible/collections
  executable location = /root/kubespray-test/venv-2.14/bin/ansible
  python version = 3.9.18 (main, Jan  4 2024, 00:00:00) [GCC 11.4.1 20230605 (Red Hat 11.4.1-2)] (/root/kubespray-test/venv-2.14/bin/python)
  jinja version = 3.1.3
  libyaml = True

Version of Python

Python 3.9.18

Version of Kubespray (commit)

66eaba3

Network plugin used

calico

Full inventory with variables

See "How can we reproduce it" section. Just that inventory, no variables.

Command used to invoke ansible

See "How can we reproduce it" section

Output of ansible run

PLAY [Prepare nodes for upgrade] ********************************************************************************************************************************************************************************

TASK [kubespray-defaults : Gather ansible_default_ipv4 from all hosts] ******************************************************************************************************************************************
ok: [k8s1.local] => (item=k8s1.local)
[WARNING]: Unhandled error in Python interpreter discovery for host k8s1.local: Failed to connect to the host via ssh: ssh: connect to host k8s3.local port 22: Connection timed out
failed: [k8s1.local -> k8s3.local] (item=k8s3.local) => {"ansible_loop_var": "item", "item": "k8s3.local", "msg": "Data could not be sent to remote host \"k8s3.local\". Make sure this host can be reached over ssh: ssh: connect to host k8s3.local port 22: Connection timed out\r\n", "unreachable": true}
ok: [k8s1.local -> k8s2.local] => (item=k8s2.local)
fatal: [k8s1.local -> {{ item }}]: UNREACHABLE! => {"changed": false, "msg": "All items completed", "results": [{"ansible_facts": {"ansible_default_ipv4": {"address": "10.88.111.29", "alias": "eth0", "broadcast": "10.88.111.255", "gateway": "10.88.111.254", "interface": "eth0", "macaddress": "bc:24:11:41:88:12", "mtu": 1500, "netmask": "255.255.252.0", "network": "10.88.108.0", "prefix": "22", "type": "ether"}, "discovered_interpreter_python": "/usr/bin/python3"}, "ansible_loop_var": "item", "changed": false, "failed": false, "invocation": {"module_args": {"fact_path": "/etc/ansible/facts.d", "filter": ["ansible_default_ipv4"], "gather_subset": ["!all", "network"], "gather_timeout": 10}}, "item": "k8s1.local"}, {"ansible_loop_var": "item", "item": "k8s3.local", "msg": "Data could not be sent to remote host \"k8s3.local\". Make sure this host can be reached over ssh: ssh: connect to host k8s3.local port 22: Connection timed out\r\n", "unreachable": true}, {"ansible_facts": {"ansible_default_ipv4": {"address": "10.88.111.30", "alias": "eth0", "broadcast": "10.88.111.255", "gateway": "10.88.111.254", "interface": "eth0", "macaddress": "bc:24:11:be:42:a6", "mtu": 1500, "netmask": "255.255.252.0", "network": "10.88.108.0", "prefix": "22", "type": "ether"}, "discovered_interpreter_python": "/usr/bin/python3"}, "ansible_loop_var": "item", "changed": false, "failed": false, "invocation": {"module_args": {"fact_path": "/etc/ansible/facts.d", "filter": ["ansible_default_ipv4"], "gather_subset": ["!all", "network"], "gather_timeout": 10}}, "item": "k8s2.local"}]}
...ignoring

NO MORE HOSTS LEFT **********************************************************************************************************************************************************************************************

PLAY RECAP ******************************************************************************************************************************************************************************************************
k8s1.local                 : ok=1    changed=0    unreachable=0    failed=0    skipped=0    rescued=0    ignored=1

Anything else we need to know

PR #10601 added ignore_unreachable: true to this task. That changed the Play Recap to show ignored=1 instead of unreachable=1, but it does not solve the underlying problem: the play still exits early.
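One possible direction, shown below as a minimal sketch only (I am not claiming this is what #11006 implements), is to drop the run_once delegation loop and let every host in the play gather its own minimal facts, so an unreachable host only takes itself out of the play instead of poisoning the aggregated loop result:

- name: Gather ansible_default_ipv4 from all hosts
  setup:
    gather_subset: '!all,network'
    filter: "ansible_default_ipv4"
  ignore_unreachable: true  # skip hosts that cannot be reached instead of ending the play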
