Regular crashes during normal operation #180

Open
opened 2025-11-01 23:12:14 +01:00 by mishi · 12 comments
mishi commented 2025-11-01 23:12:14 +01:00 (Migrated from git.mattzz.de)

After 20-30 min of scaring the pumpkin application crashes with the following error message:

[Oct 31 19:08:00 youngcouple uwsgi[2462]: pumpkin.tools      - INFO: playing back in background /home/pi/halloween/pumpkin/static/s
ounds/TimeOfMyLife-short.mp3
Oct 31 19:08:00 youngcouple uwsgi[2635]: tcgetattr(): Inappropriate ioctl for device
Oct 31 19:08:00 youngcouple uwsgi[2462]: pumpkin.scary_actions.choreography_action - INFO: Getting choreography mode from brainz..
.
Oct 31 19:08:08 youngcouple uwsgi[2462]: Exception in thread Thread-60:
Oct 31 19:08:08 youngcouple uwsgi[2462]: Traceback (most recent call last):
Oct 31 19:08:08 youngcouple uwsgi[2462]:   File "/usr/lib/python3.9/threading.py", line 954, in _bootstrap_inner
Oct 31 19:08:08 youngcouple uwsgi[2462]:     self.run()
Oct 31 19:08:08 youngcouple uwsgi[2462]:   File "/usr/lib/python3.9/threading.py", line 892, in run
Oct 31 19:08:08 youngcouple uwsgi[2462]:     self._target(*self._args, **self._kwargs)
Oct 31 19:08:08 youngcouple uwsgi[2462]:   File "./pumpkin/scary_actions/youngcouple_action.py", line 193, in dance
Oct 31 19:08:08 youngcouple uwsgi[2462]:     port_io.set_output(db, 1, 0)
Oct 31 19:08:08 youngcouple uwsgi[2462]:   File "./pumpkin/port_io.py", line 26, in set_output
Oct 31 19:08:08 youngcouple uwsgi[2462]:     set_vardb(db, str(port_id), value)
Oct 31 19:08:08 youngcouple uwsgi[2462]:   File "./pumpkin/database.py", line 14, in set_vardb
Oct 31 19:08:08 youngcouple uwsgi[2462]:     db.execute('update VARIABLE set VALUE=? where NAME=?', (value, name))
Oct 31 19:08:08 youngcouple uwsgi[2462]: sqlite3.OperationalError: attempt to write a readonly database
Oct 31 19:09:00 youngcouple uwsgi[2462]: pumpkin.scary_actions.choreography_action - INFO: Performing choreography action dance at
 0
Oct 31 19:09:00 youngcouple uwsgi[2462]: Exception in thread Thread-62:
Oct 31 19:09:00 youngcouple uwsgi[2462]: Traceback (most recent call last):
Oct 31 19:09:00 youngcouple uwsgi[2462]:   File "/usr/lib/python3.9/threading.py", line 954, in _bootstrap_inner
Oct 31 19:09:00 youngcouple uwsgi[2462]:     self.run()
Oct 31 19:09:00 youngcouple uwsgi[2462]:   File "/usr/lib/python3.9/threading.py", line 892, in run
Oct 31 19:09:00 youngcouple uwsgi[2462]:     self._target(*self._args, **self._kwargs)
Oct 31 19:09:00 youngcouple uwsgi[2462]:   File "./pumpkin/scary_actions/youngcouple_action.py", line 170, in dance
Oct 31 19:09:00 youngcouple uwsgi[2462]:     if get_vardb(db, 'is_scaring') == '0':
Oct 31 19:09:00 youngcouple uwsgi[2462]:   File "./pumpkin/database.py", line 18, in get_vardb
Oct 31 19:09:00 youngcouple uwsgi[2462]:     cur = db.execute('select VALUE from VARIABLE where NAME=?', (name,))
Oct 31 19:09:00 youngcouple uwsgi[2462]: sqlite3.OperationalError: no such table: VARIABLE
Oct 31 19:09:00 youngcouple uwsgi[2462]: pumpkin.scary_actions.choreography_action - INFO: Getting choreography mode from brainz..
.
Oct 31 19:09:00 youngcouple uwsgi[2462]: Exception in thread Thread-4:
Oct 31 19:09:00 youngcouple uwsgi[2462]: Traceback (most recent call last):
Oct 31 19:09:00 youngcouple uwsgi[2462]:   File "/usr/lib/python3.9/threading.py", line 954, in _bootstrap_inner
Oct 31 19:09:00 youngcouple uwsgi[2462]:     self.run()
Oct 31 19:09:00 youngcouple uwsgi[2462]:   File "/usr/lib/python3.9/threading.py", line 892, in run
Oct 31 19:09:00 youngcouple uwsgi[2462]:     self._target(*self._args, **self._kwargs)
Oct 31 19:09:00 youngcouple uwsgi[2462]:   File "./pumpkin/scary_actions/choreography_action.py", line 196, in choreography_demon
Oct 31 19:09:00 youngcouple uwsgi[2462]:     config_demon()
Oct 31 19:09:00 youngcouple uwsgi[2462]:   File "./pumpkin/scary_actions/choreography_action.py", line 95, in config_demon
Oct 31 19:09:00 youngcouple uwsgi[2462]:     if previous_server_mode != server_mode or get_vardb(db, 'choreography_mode') == 'parked':
Oct 31 19:09:00 youngcouple uwsgi[2462]:   File "./pumpkin/database.py", line 18, in get_vardb
Oct 31 19:09:00 youngcouple uwsgi[2462]:     cur = db.execute('select VALUE from VARIABLE where NAME=?', (name,))
Oct 31 19:09:00 youngcouple uwsgi[2462]: sqlite3.OperationalError: no such table: VARIABLE

After 20-30 min of scaring the pumpkin application crashes with the following error message: ``` [Oct 31 19:08:00 youngcouple uwsgi[2462]: pumpkin.tools - INFO: playing back in background /home/pi/halloween/pumpkin/static/s ounds/TimeOfMyLife-short.mp3 Oct 31 19:08:00 youngcouple uwsgi[2635]: tcgetattr(): Inappropriate ioctl for device Oct 31 19:08:00 youngcouple uwsgi[2462]: pumpkin.scary_actions.choreography_action - INFO: Getting choreography mode from brainz.. . Oct 31 19:08:08 youngcouple uwsgi[2462]: Exception in thread Thread-60: Oct 31 19:08:08 youngcouple uwsgi[2462]: Traceback (most recent call last): Oct 31 19:08:08 youngcouple uwsgi[2462]: File "/usr/lib/python3.9/threading.py", line 954, in _bootstrap_inner Oct 31 19:08:08 youngcouple uwsgi[2462]: self.run() Oct 31 19:08:08 youngcouple uwsgi[2462]: File "/usr/lib/python3.9/threading.py", line 892, in run Oct 31 19:08:08 youngcouple uwsgi[2462]: self._target(*self._args, **self._kwargs) Oct 31 19:08:08 youngcouple uwsgi[2462]: File "./pumpkin/scary_actions/youngcouple_action.py", line 193, in dance Oct 31 19:08:08 youngcouple uwsgi[2462]: port_io.set_output(db, 1, 0) Oct 31 19:08:08 youngcouple uwsgi[2462]: File "./pumpkin/port_io.py", line 26, in set_output Oct 31 19:08:08 youngcouple uwsgi[2462]: set_vardb(db, str(port_id), value) Oct 31 19:08:08 youngcouple uwsgi[2462]: File "./pumpkin/database.py", line 14, in set_vardb Oct 31 19:08:08 youngcouple uwsgi[2462]: db.execute('update VARIABLE set VALUE=? where NAME=?', (value, name)) Oct 31 19:08:08 youngcouple uwsgi[2462]: sqlite3.OperationalError: attempt to write a readonly database Oct 31 19:09:00 youngcouple uwsgi[2462]: pumpkin.scary_actions.choreography_action - INFO: Performing choreography action dance at 0 Oct 31 19:09:00 youngcouple uwsgi[2462]: Exception in thread Thread-62: Oct 31 19:09:00 youngcouple uwsgi[2462]: Traceback (most recent call last): Oct 31 19:09:00 youngcouple uwsgi[2462]: File "/usr/lib/python3.9/threading.py", line 954, in _bootstrap_inner Oct 31 19:09:00 youngcouple uwsgi[2462]: self.run() Oct 31 19:09:00 youngcouple uwsgi[2462]: File "/usr/lib/python3.9/threading.py", line 892, in run Oct 31 19:09:00 youngcouple uwsgi[2462]: self._target(*self._args, **self._kwargs) Oct 31 19:09:00 youngcouple uwsgi[2462]: File "./pumpkin/scary_actions/youngcouple_action.py", line 170, in dance Oct 31 19:09:00 youngcouple uwsgi[2462]: if get_vardb(db, 'is_scaring') == '0': Oct 31 19:09:00 youngcouple uwsgi[2462]: File "./pumpkin/database.py", line 18, in get_vardb Oct 31 19:09:00 youngcouple uwsgi[2462]: cur = db.execute('select VALUE from VARIABLE where NAME=?', (name,)) Oct 31 19:09:00 youngcouple uwsgi[2462]: sqlite3.OperationalError: no such table: VARIABLE Oct 31 19:09:00 youngcouple uwsgi[2462]: pumpkin.scary_actions.choreography_action - INFO: Getting choreography mode from brainz.. . Oct 31 19:09:00 youngcouple uwsgi[2462]: Exception in thread Thread-4: Oct 31 19:09:00 youngcouple uwsgi[2462]: Traceback (most recent call last): Oct 31 19:09:00 youngcouple uwsgi[2462]: File "/usr/lib/python3.9/threading.py", line 954, in _bootstrap_inner Oct 31 19:09:00 youngcouple uwsgi[2462]: self.run() Oct 31 19:09:00 youngcouple uwsgi[2462]: File "/usr/lib/python3.9/threading.py", line 892, in run Oct 31 19:09:00 youngcouple uwsgi[2462]: self._target(*self._args, **self._kwargs) Oct 31 19:09:00 youngcouple uwsgi[2462]: File "./pumpkin/scary_actions/choreography_action.py", line 196, in choreography_demon Oct 31 19:09:00 youngcouple uwsgi[2462]: config_demon() Oct 31 19:09:00 youngcouple uwsgi[2462]: File "./pumpkin/scary_actions/choreography_action.py", line 95, in config_demon Oct 31 19:09:00 youngcouple uwsgi[2462]: if previous_server_mode != server_mode or get_vardb(db, 'choreography_mode') == 'parked': Oct 31 19:09:00 youngcouple uwsgi[2462]: File "./pumpkin/database.py", line 18, in get_vardb Oct 31 19:09:00 youngcouple uwsgi[2462]: cur = db.execute('select VALUE from VARIABLE where NAME=?', (name,)) Oct 31 19:09:00 youngcouple uwsgi[2462]: sqlite3.OperationalError: no such table: VARIABLE ```
mishi commented 2025-11-01 23:34:44 +01:00 (Migrated from git.mattzz.de)

One suggestion from chatgpt is:

Sharing one SQLite connection across threads
Using one db object in multiple threads can corrupt state or trip weird errors. Even check_same_thread=False doesn’t make it truly safe.
Fix: create a separate connection per thread/request, and close it properly.

One suggestion from chatgpt is: > Sharing one SQLite connection across threads >Using one db object in multiple threads can corrupt state or trip weird errors. Even check_same_thread=False doesn’t make it truly safe. >Fix: create a separate connection per thread/request, and close it properly.
mishi commented 2025-11-02 01:03:51 +01:00 (Migrated from git.mattzz.de)

Here's the entire log file of the young couple

Here's the entire log file of the young couple
mishi commented 2025-11-02 10:56:03 +01:00 (Migrated from git.mattzz.de)

My theory is that the slower responses of the server when several hosts work in parallel and the Wifi is burdened by the visitors and that leads to threading problems with SQLite.

The default of requests.get is no timeout - which may worsen the issue. A timeout definitely needs to be added.

I will also test if it works to create a separate connection to the file. It may be better however to use a PostgreSQL db.

I'll first add logging and fire up a few hosts in parallel and see how long those requests take without any changes.

My theory is that the slower responses of the server when several hosts work in parallel and the Wifi is burdened by the visitors and that leads to threading problems with SQLite. The default of requests.get is no timeout - which may worsen the issue. A timeout definitely needs to be added. I will also test if it works to create a separate connection to the file. It may be better however to use a PostgreSQL db. I'll first add logging and fire up a few hosts in parallel and see how long those requests take without any changes.
mishi commented 2025-11-02 11:06:08 +01:00 (Migrated from git.mattzz.de)

A question is if we need a database at all. With those little data we have we could also keep them in a thread that maintains a data structure in memory and offers access functions. A postgreSQL db might be a problem for Raspberry Pi Zeros...

A question is if we need a database at all. With those little data we have we could also keep them in a thread that maintains a data structure in memory and offers access functions. A postgreSQL db might be a problem for Raspberry Pi Zeros...
mishi commented 2025-11-02 11:26:01 +01:00 (Migrated from git.mattzz.de)

I will build up an experiment: Add a memoryDB module and use that as database instead in database.py...

This is what ChatGPT suggests as memoryDB:

'memorydb.py
import threading

class MemoryDB:

def __init__(self):
    self._lock = threading.RLock()
    self._data = {}

def set(self, name: str, value: str):
    """Create or update a variable."""
    with self._lock:
        self._data[name] = str(value)

def get(self, name: str, default=None) -> str:
    """Retrieve a variable by name, or return default."""
    with self._lock:
        return self._data.get(name, default)

def update(self, name: str, value: str):
    """Update only if key exists."""
    with self._lock:
        if name in self._data:
            self._data[name] = str(value)
        else:
            raise KeyError(f"No such variable: {name}")

def clear(self):
    """Remove all key-value pairs from the database."""
    with self._lock:
        self._data.clear()

def items(self):
    """Return a snapshot (copy) of all key-value pairs."""
    with self._lock:
        return dict(self._data)

def __repr__(self):
    with self._lock:
        return f"<MemoryDB {self._data}>"`
I will build up an experiment: Add a memoryDB module and use that as database instead in database.py... This is what ChatGPT suggests as memoryDB: 'memorydb.py import threading class MemoryDB: def __init__(self): self._lock = threading.RLock() self._data = {} def set(self, name: str, value: str): """Create or update a variable.""" with self._lock: self._data[name] = str(value) def get(self, name: str, default=None) -> str: """Retrieve a variable by name, or return default.""" with self._lock: return self._data.get(name, default) def update(self, name: str, value: str): """Update only if key exists.""" with self._lock: if name in self._data: self._data[name] = str(value) else: raise KeyError(f"No such variable: {name}") def clear(self): """Remove all key-value pairs from the database.""" with self._lock: self._data.clear() def items(self): """Return a snapshot (copy) of all key-value pairs.""" with self._lock: return dict(self._data) def __repr__(self): with self._lock: return f"<MemoryDB {self._data}>"`
mattzz commented 2025-11-02 11:43:49 +01:00 (Migrated from git.mattzz.de)

Hm. Glad we found that it is clearly db related.
My understanding is that there is no global context in this kind of web applications, so it's less about persistence of data and more about accessing a global context, shared via db. So in order to share a global context, db usage seemed to be way. However, maybe there are other ways to provide global context.

Regarding in memory storage: The db is already in memory, as far as I remember. The sqlite file is being served from a ram disk to prevent flash corruption due to frequent writes.

If there was a way to make sqlite thread safe, that would be the route I would chose. And refactoring open/close calls. Postgres/Mariadb etc are way too heavy solutions for this problem, imho.

Hm. Glad we found that it is clearly db related. My understanding is that there is no global context in this kind of web applications, so it's less about persistence of data and more about accessing a global context, shared via db. So in order to share a global context, db usage seemed to be way. However, maybe there are other ways to provide global context. Regarding in memory storage: The db is already in memory, as far as I remember. The sqlite file is being served from a ram disk to prevent flash corruption due to frequent writes. If there was a way to make sqlite thread safe, that would be the route I would chose. And refactoring open/close calls. Postgres/Mariadb etc are way too heavy solutions for this problem, imho.
mattzz commented 2025-11-02 11:52:14 +01:00 (Migrated from git.mattzz.de)

On sqlite side of things there are a few improvements possible:

  • make sure sqlite is initialized (and compiled with) thread support (SQLITE_CONFIG_MULTITHREAD)
  • make sure to open a db connection for each thread
  • check sqlite timeout settings
On sqlite side of things there are a few improvements possible: - make sure sqlite is initialized (and compiled with) thread support (SQLITE_CONFIG_MULTITHREAD) - make sure to open a db connection for each thread - check sqlite timeout settings
mishi commented 2025-11-09 13:09:32 +01:00 (Migrated from git.mattzz.de)

I have done a bit of research about flask and the database usage and I think I found an issue with how the data connection is used by the endless task. The database connection is meant to be used by a request and to be destroyed afterwards. However, we now have a number of endless tasks which access the database and write to it. And they open the connection, write to it here and there and never close it. This seems wrong.

I have done a bit of research about flask and the database usage and I think I found an issue with how the data connection is used by the endless task. The database connection is meant to be used by a request and to be destroyed afterwards. However, we now have a number of endless tasks which access the database and write to it. And they open the connection, write to it here and there and never close it. This seems wrong.
mishi commented 2025-11-09 13:13:19 +01:00 (Migrated from git.mattzz.de)

Regarding file system and sd card accesses: The database is stored in /run/shm which is an in-memory location which is lost after power down. So there are no SD card write accesses by database accesses.

Regarding file system and sd card accesses: The database is stored in /run/shm which is an in-memory location which is lost after power down. So there are no SD card write accesses by database accesses.
mishi commented 2025-11-09 14:17:28 +01:00 (Migrated from git.mattzz.de)

I have done a bit of research about flask and the database usage and I think I found an issue with how the data connection is used by the endless task. The database connection is meant to be used by a request and to be destroyed afterwards. However, we now have a number of endless tasks which access the database and write to it. And they open the connection, write to it here and there and never close it. This seems wrong.

Let's identify accesses to the db which deviate from the normal approach of opening the connection for an instant request responses:

  • Until now, we had the "is_scaring" value which was written at the beginning and end of the scary action.
  • The random_timer in rest_api has a loop in which it reads the db
  • choreography_demon has loop in which it reads the db
  • There are some tasks which have been retrieving information by rest requests in discovery: discover_haunted_house and config - which are not called during automatic operation.
  • And now we also have the config_demo which requests data from the brain and writes them to the db as part of the automatic routine.
> I have done a bit of research about flask and the database usage and I think I found an issue with how the data connection is used by the endless task. The database connection is meant to be used by a request and to be destroyed afterwards. However, we now have a number of endless tasks which access the database and write to it. And they open the connection, write to it here and there and never close it. This seems wrong. Let's identify accesses to the db which deviate from the normal approach of opening the connection for an instant request responses: - Until now, we had the "is_scaring" value which was written at the beginning and end of the scary action. - The random_timer in rest_api has a loop in which it reads the db - choreography_demon has loop in which it reads the db - There are some tasks which have been retrieving information by rest requests in discovery: discover_haunted_house and config - which are not called during automatic operation. - And now we also have the config_demo which requests data from the brain and writes them to the db as part of the automatic routine.
mishi commented 2025-11-09 15:09:01 +01:00 (Migrated from git.mattzz.de)

ChatGPT recommends the following:

In Flask request handlers
def get_db(): if 'db' not in g: g.db = sqlite3.connect(DATABASE) return g.db

Each request runs in its own thread, so g.db is its per-thread connection. Flask tears it down afterward.
In background threads

def background_worker(): conn = sqlite3.connect(DATABASE) while True: # Do long or repeating work ...

Each worker opens its own connection once and reuses it until it’s done.

ChatGPT recommends the following: In Flask request handlers `def get_db(): if 'db' not in g: g.db = sqlite3.connect(DATABASE) return g.db` Each request runs in its own thread, so g.db is its per-thread connection. Flask tears it down afterward. In background threads `def background_worker(): conn = sqlite3.connect(DATABASE) while True: # Do long or repeating work ...` Each worker opens its own connection once and reuses it until it’s done.
mishi commented 2025-11-09 15:25:26 +01:00 (Migrated from git.mattzz.de)

Requests get their own thread and their own db connection and we only have to make sure that non-request-triggered background tasks have their own connection and don't create interfering requests.

Requests get their own thread and their own db connection and we only have to make sure that non-request-triggered background tasks have their own connection and don't create interfering requests.
Sign in to join this conversation.
No milestone
No project
No assignees
1 participant
Notifications
Due date
The due date is invalid or out of range. Please use the format "yyyy-mm-dd".

No due date set.

Dependencies

No dependencies set.

Reference
mattzz/halloween#180
No description provided.