Skip to content

use rmutex inside of all repeated implementation#6320

Merged
xiaoxiang781216 merged 1 commit into
apache:masterfrom
anjiahao1:tihuan_rmutex
May 30, 2022
Merged

use rmutex inside of all repeated implementation#6320
xiaoxiang781216 merged 1 commit into
apache:masterfrom
anjiahao1:tihuan_rmutex

Conversation

@anjiahao1

Copy link
Copy Markdown
Contributor

Signed-off-by: anjiahao anjiahao@xiaomi.com

Impact

maybe Slightly reduced szie and increased speed

Testing

CI

Comment thread drivers/1wire/1wire.c Outdated
Comment thread drivers/1wire/1wire.c Outdated
Comment thread drivers/1wire/1wire.c Outdated
@anjiahao1 anjiahao1 force-pushed the tihuan_rmutex branch 4 times, most recently from ca0bc6f to 133f87a Compare May 25, 2022 05:20
Comment thread drivers/rptun/rptun.c Outdated
Comment thread drivers/rptun/rptun.c Outdated
Comment thread net/route/net_fileroute.c
Comment thread libs/libc/stdio/lib_libfilesem.c
Comment thread libs/libc/netdb/lib_dnsinit.c
Comment thread include/nuttx/mutex.h Outdated
Comment thread include/nuttx/mutex.h
Comment thread libs/libc/modlib/modlib_registry.c
Comment thread fs/inode/fs_inode.c Outdated
Comment thread fs/aio/aio_initialize.c
Comment thread drivers/syslog/syslog_device.c Outdated
Comment thread drivers/syslog/syslog_device.c Outdated
@anjiahao1 anjiahao1 force-pushed the tihuan_rmutex branch 2 times, most recently from e9db895 to 9aeddfe Compare May 25, 2022 09:11
Comment thread drivers/usbhost/usbhost_max3421e.c Outdated
Comment thread drivers/rptun/rptun.c Outdated
Comment thread drivers/rptun/rptun.c Outdated
Comment thread fs/tmpfs/fs_tmpfs.c Outdated
Comment thread include/nuttx/fs/fs.h
@anjiahao1 anjiahao1 force-pushed the tihuan_rmutex branch 3 times, most recently from aa0d037 to 3d021b1 Compare May 25, 2022 15:59

@pkarashchenko pkarashchenko left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general looks good. Just few comments left

Comment thread include/nuttx/mutex.h Outdated
@anjiahao1 anjiahao1 force-pushed the tihuan_rmutex branch 2 times, most recently from 4614cf6 to dfd8bae Compare May 26, 2022 02:27
@anjiahao1

Copy link
Copy Markdown
Contributor Author

image
What is the difference between _SEM_DOSTRY andnxrmutex_destroy

@pkarashchenko

pkarashchenko commented May 26, 2022

Copy link
Copy Markdown
Contributor

image What is the difference between _SEM_DOSTRY andnxrmutex_destroy

#if !defined(CONFIG_BUILD_FLAT) && defined(__KERNEL__)
#  define _SEM_DESTROY(s)       nxsem_destroy(s)
#else
#  define _SEM_DESTROY(s)       sem_destroy(s)
#endif

So the libc layer should be _SEM_DESTROY and not nxsem_destroy
Or we need to change mutex.h to use _SEM_ instead of nxsem_

Comment thread include/nuttx/mutex.h Outdated
Comment thread include/nuttx/mutex.h Outdated
Comment thread include/nuttx/mutex.h
Comment thread libs/libc/modlib/modlib_registry.c
@anjiahao1

Copy link
Copy Markdown
Contributor Author

image
maybe we need MUTEX_INITIALIZER -> NXMUTEX_INITIALIZER

@pkarashchenko

Copy link
Copy Markdown
Contributor

Wow, that is something new. Need to check mainline builds

Comment thread libs/libc/modlib/modlib_registry.c
Comment thread include/nuttx/mutex.h Outdated
Comment thread libs/libc/stdio/lib_libfilesem.c
Comment thread net/route/net_fileroute.c
@anjiahao1 anjiahao1 force-pushed the tihuan_rmutex branch 4 times, most recently from 0277763 to d9d5704 Compare May 27, 2022 11:14
Comment thread libs/libc/stdio/lib_libfilesem.c Outdated
Comment thread include/nuttx/mutex.h Outdated
Comment thread include/nuttx/mutex.h Outdated
Comment thread include/nuttx/mutex.h Outdated
Comment thread include/nuttx/mutex.h Outdated
Comment thread include/nuttx/mutex.h Outdated
@pkarashchenko

Copy link
Copy Markdown
Contributor

@xiaoxiang781216 I see that we are attempting to solve the mutex issue with "small blood", but IMO we need to look deeper. We have priority inheritance in place that should be applied to mutex only, but is implemented at semaphore level (that I believe is totally wrong), so we have holder for mutex and holder for semaphore in case of priority inheritance that at the moment of time when mutex is held are the same process IDs.
I truly believe that we need to implement nxmutex similar to nxsem as a separate kernel object and locate priority inheritance there, so we will "kill two rabbits with a single shot". Until then we will suffer from dozens of corner cases and complicating logic to cover it.

@xiaoxiang781216

xiaoxiang781216 commented May 27, 2022

Copy link
Copy Markdown
Contributor

@xiaoxiang781216 I see that we are attempting to solve the mutex issue with "small blood", but IMO we need to look deeper. We have priority inheritance in place that should be applied to mutex only, but is implemented at semaphore level (that I believe is totally wrong), so we have holder for mutex and holder for semaphore in case of priority inheritance that at the moment of time when mutex is held are the same process IDs. I truly believe that we need to implement nxmutex similar to nxsem as a separate kernel object and locate priority inheritance there, so we will "kill two rabbits with a single shot". Until then we will suffer from dozens of corner cases and complicating logic to cover it.

Yes, I agree that it isn't good to use semaphore both for the signal and lock. That's why we add nxmutex_t and nxrmutex_t:

  1. Implement the recursive lock(nxmutex_t) and migrate all usage to it
  2. Migrate all unrecursive sem lock/unlock to mutex_t

With the above change, we can get the clean code base that:

  1. All code(except the implementation of [pthread_|nxr]mutex_t) use [nx]sem_t for the signal only
  2. All code use [pthread_|nxr]mutex_t as the lock

The next steps are:

  1. Change the default protocol from SEM_PRIO_INHERIT to SEM_PRIO_NONE
  2. Remove sem_setprotocol(&sem, SEM_PRIO_NONE) from the code base
  3. Add sem_setprotocol(&sem, SEM_PRIO_INHERIT) to the implementation of [pthread_|nxr]mutex_t)

After this, all priority inheritance should get fixed. Of course, we can move the implementation of priority inheritance from sem_t to mutex_t, so people never get the unexpected behavior with semaphore.

@pkarashchenko

Copy link
Copy Markdown
Contributor

Sounds like a plan :)

@pkarashchenko pkarashchenko self-requested a review May 27, 2022 19:06
@anjiahao1 anjiahao1 force-pushed the tihuan_rmutex branch 2 times, most recently from d17c862 to 3310dec Compare May 29, 2022 14:33
Comment thread include/nuttx/mutex.h
Comment thread include/nuttx/mutex.h Outdated
Comment thread fs/tmpfs/fs_tmpfs.c
Comment thread fs/tmpfs/fs_tmpfs.c Outdated
Comment thread include/nuttx/mutex.h Outdated
@anjiahao1 anjiahao1 force-pushed the tihuan_rmutex branch 2 times, most recently from a21b691 to 6ac80a8 Compare May 30, 2022 08:03
Signed-off-by: anjiahao <anjiahao@xiaomi.com>
Comment on lines -130 to -164
static inline int syslog_dev_takesem(FAR struct syslog_dev_s *syslog_dev)
{
pid_t me = getpid();
int ret;

/* Does this thread already hold the semaphore? That could happen if
* we were called recursively, i.e., if the logic kicked off by
* file_write() where to generate more debug output. Return an
* error in that case.
*/

if (syslog_dev->sl_holder == me)
{
/* Return an error (instead of deadlocking) */

return -EWOULDBLOCK;
}

/* Either the semaphore is available or is currently held by another
* thread. Wait for it to become available.
*/

ret = nxsem_wait(&syslog_dev->sl_sem);
if (ret < 0)
{
return ret;
}

/* We hold the semaphore. We can safely mark ourself as the holder
* of the semaphore.
*/

syslog_dev->sl_holder = me;
return OK;
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@xiaoxiang781216 I observe deadlock after I migrate my application to the latest NuttX master. I narrow down the issue and see that it happens when multiple tasks try to use syslog. I have a syslog configuration with 2 channels:

  1. Channel attached to console
  2. Channel attached to file on /mnt/sdcard0/nuttx.log
    When the issue happens I observe next picture:
    Screenshot from 2022-06-12 20-09-50

I've noticed that before this change syslog_dev_takesem returned return -EWOULDBLOCK; with a comment /* Return an error (instead of deadlocking) */ and seems like this functionality has been broken now.

In general I think that we rush too much with replacing of semaphores with mutexes. For example the FLAT build functioning changed after this change since all nxsem_ API calls became replaced with sem_ API calls and introduced unnecessary cancellation points in kernel. I'm not sure about the side effect of this, but still investigating.
I think we need to revert part of the changes from this PR especially syslog part and part related to replacement in libc and get back nxmutex_ to use nxsem_ APIs only and use it in kernel part only for now. Then make a step-by-step change to user space.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I hope I did all correctly in #6414

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Replace all place which use semaphore as lock with mutex wrapper

4 participants