CA-426228: Fix race condition in run_xapi_async_tasks#235

Open
stephenchengCloud wants to merge 1 commit into master from private/stephenche/CA-426228

Conversation

@stephenchengCloud
Collaborator

When multiple async xapi tasks are fired (e.g. VM.start_on), xapi may complete and garbage-collect a task before the polling loop reads its status. This causes task.get_status() to throw HANDLE_INVALID, crashing the test even though the operation succeeded.

2026-04-14 12:17:04,855 auto-cert-kit: ERROR testbase.py:182 Test Case Failure: test_tx_throughput
2026-04-14 12:17:04,856 auto-cert-kit: DEBUG testbase.py:183 Traceback (most recent call last):
  File "/opt/xensource/packages/files/auto-cert-kit/testbase.py", line 143, in run_test
    res = getattr(self, test)(self.session)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/xensource/packages/files/auto-cert-kit/network_tests.py", line 766, in test_tx_throughput
    return self._run_test(session, direction)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/xensource/packages/files/auto-cert-kit/network_tests.py", line 1151, in _run_test
    self.ops_test(session, vm_list)
  File "/opt/xensource/packages/files/auto-cert-kit/network_tests.py", line 1325, in ops_test
    start_droid_vms(session, [(master_ref, vm) for vm in vms])
  File "/opt/xensource/packages/files/auto-cert-kit/utils.py", line 1988, in start_droid_vms
    run_xapi_async_tasks(session,
  File "/opt/xensource/packages/files/auto-cert-kit/utils.py", line 1716, in run_xapi_async_tasks
    status = session.xenapi.task.get_status(ref)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/site-packages/XenAPI.py", line 317, in call
    return self.__send(self.__name, args)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/site-packages/XenAPI.py", line 198, in xenapi_request
    result = _parse_result(getattr(self, methodname)(*full_params))
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/site-packages/XenAPI.py", line 292, in _parse_result
    raise Failure(result['ErrorDescription'])
XenAPI.Failure: ['HANDLE_INVALID', 'task', 'OpaqueRef:f5f33026-684e-9a42-29f3-533a69e56998']
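
For illustration only, here is a minimal sketch of a polling loop that tolerates a task disappearing between polls. It is not the diff in this PR. The helper names `poll_task_status` and `wait_for_tasks` are hypothetical, `Failure` is a local stand-in for `XenAPI.Failure` (which exposes the error list as `.details`), and `get_status` stands for a callable such as `session.xenapi.task.get_status`. Recording a vanished task as "gone" rather than "success" is an assumption: a caller would likely still want to confirm the outcome some other way, for example by checking the VM power state.

```python
import time


class Failure(Exception):
    """Local stand-in for XenAPI.Failure; the error list lives in .details."""

    def __init__(self, details):
        super().__init__(details)
        self.details = details


def poll_task_status(get_status, ref):
    """Return the task status, or None if the task handle no longer exists."""
    try:
        return get_status(ref)
    except Failure as exc:
        # Swallow only HANDLE_INVALID for a task; re-raise every other failure.
        if exc.details[:2] == ["HANDLE_INVALID", "task"]:
            return None
        raise


def wait_for_tasks(get_status, refs, interval=0.0, timeout=5.0):
    """Poll until no task is pending; map each ref to its last observed state."""
    results = {}
    pending = set(refs)
    deadline = time.monotonic() + timeout
    while pending:
        if time.monotonic() > deadline:
            raise TimeoutError("tasks still pending: %s" % sorted(pending))
        for ref in list(pending):
            status = poll_task_status(get_status, ref)
            if status is None:
                # The task was destroyed before we could read it (assumed benign).
                results[ref] = "gone"
                pending.discard(ref)
            elif status != "pending":
                results[ref] = status
                pending.discard(ref)
        if pending:
            time.sleep(interval)
    return results
```

With a real session, the call might look like `wait_for_tasks(session.xenapi.task.get_status, task_refs, interval=1.0, timeout=600.0)`, with the exception class swapped for `XenAPI.Failure`.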

Signed-off-by: Stephen Cheng <stephen.cheng@citrix.com>
@stephenchengCloud
Collaborator Author

After the fix, I reran the tests on the same machine:

[root@perfuk-20-01d ~]# cat results.txt
Tests that passed:
network_tests.BondingTestClass.test_nic_bond_active_backup
network_tests.BondingTestClass.test_nic_bond_balance_slb
network_tests.Dom0PIFParamTestClass1.test_rx_throughput
network_tests.Dom0PIFParamTestClass1.test_tx_throughput
network_tests.Dom0PIFParamTestClass2.test_rx_throughput
network_tests.Dom0PIFParamTestClass2.test_tx_throughput
network_tests.Dom0PIFParamTestClass3.test_rx_throughput
network_tests.Dom0PIFParamTestClass3.test_tx_throughput
network_tests.Dom0VMIperfTestClass.test_rx_throughput
network_tests.Dom0VMIperfTestClass.test_tx_throughput
network_tests.GROOffloadTestClass.test_offload_config
network_tests.InterHostSRIOVTestClass.test_rx_throughput
network_tests.InterHostSRIOVTestClass.test_tx_throughput
network_tests.IntraHostSRIOVTestClass1.test_rx_throughput
network_tests.IntraHostSRIOVTestClass1.test_tx_throughput
network_tests.IntraHostSRIOVTestClass2.test_rx_throughput
network_tests.IntraHostSRIOVTestClass2.test_tx_throughput
network_tests.IperfTestClass.test_rx_throughput
network_tests.IperfTestClass.test_tx_throughput
network_tests.MulticastTestClass.test_rx_throughput
network_tests.MulticastTestClass.test_tx_throughput
network_tests.PIFParamTestClass.test_rx_throughput
network_tests.PIFParamTestClass.test_tx_throughput
Tests that failed: (expected, not related to this issue)
network_tests.MTUPingTestClass.test_ping
