Bug #2928

slapd Upstart status is out of control if BDB is corrupted

Added by Davide Principi almost 5 years ago. Updated almost 5 years ago.

Status: CLOSED
Start date:
Priority: Normal
Due date:
Assignee: -
% Done: 100%
Category: nethserver-directory
Target version: v6.5
Security class:
Resolution:
Affected version: v6.5-final
NEEDINFO: No

Description

A temporary power loss left a BDB log file corrupted. Running db_recover repairs the database, but slapd still has a problem restarting afterwards.

I tried to reproduce the same condition manually:
  • Warning: make a backup of your BDB files first
  •   cd /var/lib/ldap
      mkdir backup
      cp __db.00* log.* backup/
      for F in __db.* log.*; do dd if=/dev/urandom of="$F" count=10; done
      ldapsearch -Y EXTERNAL uid=admin
    
  • BOOM
  • In /var/log/messages:
    Oct 24 17:04:28 localhost kernel: slapd[15093] general protection ip:7f9f94c525c8 sp:7f9f90f7c310 error:0 in libdb-4.7.so[7f9f94c29000+16f000]
    ldap_result: Can't contact LDAP server (-1)
    [root@davidep2 ldap]# Oct 24 17:04:28 localhost init: slapd main process (15087) killed by SEGV signal
    Oct 24 17:04:28 localhost init: slapd main process ended, respawning
    Oct 24 17:04:28 localhost nslcd[15099]: caught signal SIGTERM (15), shutting down
    Oct 24 17:04:28 localhost nslcd[15099]: version 0.7.5 bailing out
    Oct 24 17:04:28 localhost init: nslcd main process (15099) terminated with status 1
    Oct 24 17:04:29 localhost init: slapd main process (15116) terminated with status 1
    Oct 24 17:04:29 localhost init: slapd main process ended, respawning
    
  • On root's console:
    [root@davidep2 ldap]# ps axf
    [...]
    15117 ?        Ss     0:00 /bin/sh -e /dev/fd/10
    15122 ?        S      0:00  \_ sleep 3
    [root@davidep2 ldap]# ldapsearch -Y EXTERNAL uid=admin
    ldap_sasl_interactive_bind_s: Can't contact LDAP server (-1)
    [root@davidep2 ldap]# status slapd
    slapd respawn/post-start, (post-start) process 15117
    

The slapd Upstart job is stuck in the respawn/post-start state: its post-start script loops forever, running ldapwhoami again and again. See slapd.conf, line 51.
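
The stuck state can be recognised from the `status slapd` line alone. A minimal sketch, assuming the line format shown in the console capture above; the function name is made up for illustration. A job passes through post-start briefly on every normal start, so a single match is only a hint, while repeated matches a few seconds apart point to the loop described here.

```shell
# Sketch: flag an Upstart job that is parked in its post-start script.
# Takes one line of `status JOB` output as its only argument.
# job_stuck_in_post_start is a hypothetical helper, not part of Upstart.
job_stuck_in_post_start() {
    case "$1" in
        *"respawn/post-start"*) return 0 ;;  # goal "respawn", state "post-start"
        *) return 1 ;;
    esac
}

# Example using the status line reported in this issue:
line='slapd respawn/post-start, (post-start) process 15117'
if job_stuck_in_post_start "$line"; then
    echo "slapd is stuck in post-start"
fi
```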


Related issues

Related to NethServer 6 - Enhancement #2785: Drop TCP wrappers hosts.allow hosts.deny templates CLOSED

Associated revisions

Revision 90a5d20f
Added by Davide Principi almost 5 years ago

hosts.allow/deny templates: removed slapd fragment. Refs #2928 #2785

Revision 92acb087
Added by Davide Principi almost 5 years ago

slapd: send "stat" log messages to syslog, in /var/log/messages. Refs #2928

Revision 566dae2a
Added by Davide Principi almost 5 years ago

slapd.conf Upstart script: avoid infinite loop on post-start. Refs #2928

- Kill dangling post-start scripts on pre-stop stanza.
- Stop respawning after 4 attempts in a minute.
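
The idea behind avoiding the infinite post-start loop can be sketched as a bounded retry. This is not the code from the commit: `wait_until_ready`, its arguments, and the probe shown in the usage comment are assumptions for illustration only. The respawn cap in the second bullet is something Upstart can express with its `respawn limit COUNT INTERVAL` stanza, though the exact stanza used in slapd.conf is not shown here.

```shell
# wait_until_ready MAX_TRIES DELAY PROBE [ARGS...]
# Runs PROBE until it succeeds or MAX_TRIES attempts have failed.
# Returns 0 as soon as PROBE succeeds, 1 when the attempts are exhausted.
# (Hypothetical helper; variables are global because POSIX sh has no "local".)
wait_until_ready() {
    max_tries=$1
    delay=$2
    shift 2
    tries=0
    while [ "$tries" -lt "$max_tries" ]; do
        if "$@" >/dev/null 2>&1; then
            return 0
        fi
        tries=$((tries + 1))
        sleep "$delay"
    done
    return 1
}

# Possible use in a post-start script (the probe and its flags are an assumption):
#   wait_until_ready 10 3 ldapwhoami -Y EXTERNAL || echo "slapd did not come up" >&2
```

Unlike an open-ended loop, this gives up after a fixed number of probes, so a slapd that never comes up cannot hold the job in post-start forever.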

Revision 5ac1d1d4
Added by Davide Principi almost 5 years ago

fix_accounts script. Refs #2928

This helper script completes the creation of user and group accounts,
wherever the Accounts DB entry is not present in system databases (see
getent).
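
The check this message describes (is an account known to the system databases?) can be sketched with getent. This is not the fix_accounts script itself: the function names, the `LOOKUP` variable, and reading account names from stdin are assumptions made for illustration.

```shell
# Lookup command; defaults to getent, can be overridden for testing.
: "${LOOKUP:=getent}"

# account_missing NAME
# Succeeds when NAME is found in neither the passwd nor the group database.
account_missing() {
    ! "$LOOKUP" passwd "$1" >/dev/null 2>&1 &&
    ! "$LOOKUP" group "$1" >/dev/null 2>&1
}

# list_missing_accounts
# Reads one account name per line on stdin and prints the names that
# the lookup command cannot resolve.
list_missing_accounts() {
    while IFS= read -r name; do
        [ -n "$name" ] || continue
        if account_missing "$name"; then
            printf '%s\n' "$name"
        fi
    done
}
```

Feeding it the names recorded in the Accounts DB would list the accounts whose creation still has to be completed.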

Revision 0e1f6444
Added by Davide Principi almost 5 years ago

fix_accounts helper script. Refs #2928

This helper script completes the creation of user and group accounts,
wherever the Accounts DB entry is not present in system databases (see
getent).

Revision a7606cda
Added by Davide Principi almost 5 years ago

Merge branch 'b2928'. Refs #2928

Revision 30a41214
Added by Davide Principi almost 5 years ago

Reverted original slapd/LogLevel value. Refs #2928

Default LogLevel=0

Revision ac17966e
Added by Davide Principi almost 5 years ago

Simplified slapd LogLevel semantics. Refs #2928

- removed rsyslog configuration template
- LogLevel prop value is passed as-is to slapd daemon
- LogLevel default is 0
- rsyslogd is restarted at the end of update event
- log messages are directed to /var/log/slapd
- added logrotate configuration

History

#1 Updated by Davide Principi almost 5 years ago

  • Related to Enhancement #2785: Drop TCP wrappers hosts.allow hosts.deny templates added

#2 Updated by Davide Principi almost 5 years ago

  • Status changed from TRIAGED to ON_DEV
  • Assignee set to Davide Principi
  • % Done changed from 20 to 30

#3 Updated by Davide Principi almost 5 years ago

  • Description updated (diff)

#4 Updated by Davide Principi almost 5 years ago

  • Status changed from ON_DEV to MODIFIED
  • Assignee deleted (Davide Principi)
  • % Done changed from 30 to 60

In branch b2928 (see revision history for details).

#5 Updated by Davide Principi almost 5 years ago

  • Subject changed from Upstart status slapd corrupted BDB to slapd Upstart status is out of control if BDB is corrupted
  • Description updated (diff)

#6 Updated by Davide Principi almost 5 years ago

  • Status changed from MODIFIED to ON_QA
  • % Done changed from 60 to 70

Also added the fix_accounts helper script to the doc directory (see the git changelog for details).

In nethserver-testing:
nethserver-directory-2.0.3-1.7gita7606cd.ns6.noarch.rpm

#7 Updated by Giacomo Sanchietti almost 5 years ago

  • Assignee set to Giacomo Sanchietti

#8 Updated by Giacomo Sanchietti almost 5 years ago

  • Status changed from ON_QA to VERIFIED
  • Assignee deleted (Giacomo Sanchietti)
  • % Done changed from 70 to 90

VERIFIED

After breaking the db, the server will fail to start:

[root@localhost ldap]# start slapd
start: Job failed to start
[root@localhost ldap]# 

If the db is good the server can be correctly started/restarted/stopped.

#9 Updated by Giacomo Sanchietti almost 5 years ago

  • Status changed from VERIFIED to ON_QA
  • % Done changed from 90 to 70

#10 Updated by Giacomo Sanchietti almost 5 years ago

  • Status changed from ON_QA to TRIAGED
  • % Done changed from 70 to 20
Moved back to TRIAGED only to change the following:
  • remove the syslog patch
  • a default LogLevel of 0 is fine

#11 Updated by Davide Principi almost 5 years ago

  • Status changed from TRIAGED to ON_DEV
  • Assignee set to Davide Principi
  • % Done changed from 20 to 30

#12 Updated by Davide Principi almost 5 years ago

  • Status changed from ON_DEV to MODIFIED
  • Assignee deleted (Davide Principi)
  • % Done changed from 30 to 60

See commit message for details

#13 Updated by Davide Principi almost 5 years ago

  • Status changed from MODIFIED to ON_QA
  • % Done changed from 60 to 70

In nethserver-testing:
nethserver-directory-2.0.3-1.10git4d83597.ns6.noarch.rpm

#14 Updated by Giacomo Sanchietti almost 5 years ago

  • Assignee set to Giacomo Sanchietti

#15 Updated by Giacomo Sanchietti almost 5 years ago

  • Status changed from ON_QA to VERIFIED
  • Assignee deleted (Giacomo Sanchietti)
  • % Done changed from 70 to 90

VERIFIED

The default configuration doesn't log anything. After setting LogLevel to 256, all log messages go to /var/log/slapd.

#16 Updated by Giacomo Sanchietti almost 5 years ago

  • Status changed from VERIFIED to CLOSED
  • % Done changed from 90 to 100
Released in nethserver-updates:
  • nethserver-directory-2.0.4-1.ns6.noarch.rpm
