Commit graph

26 commits

Author SHA1 Message Date
Michal Schmidt 1abc85b8d0 job: avoid recursion into transaction code from job cancelation
I hit an "assert(j->installed)" failure in transaction_apply(). Looking
into the backtrace I saw what happened:
1. The system was booting. var.mount/start was an installed job.
2. I pressed Ctrl+Alt+Del.
3. reboot.target was going to be isolated.
4. transaction_apply() proceeded to install a var.mount/stop job.
5. job_install() canceled the conflicting start job.
6. Depending jobs ended recursively with JOB_DEPENDENCY, among them was
   local-fs.target/start.
7. Its OnFailure action triggered - emergency.target was now going to be
   isolated.
8. We recursed back into transaction_apply() where the half-installed
   var.mount/stop job confused us.

Recursing from job installation back into the transaction code cannot be
a good idea. Avoid the problem by canceling the conflicting job
non-recursively in job_install(). I don't think we'll miss anything by
not recursing here. After all, we are called from transaction_apply().
We will not be installing just this one job, but all jobs from a
transaction. All requirement dependencies will be included in it and
will be installed separately. Every transaction job will get a chance
to cancel its own conflicting installed job.
2012-10-25 03:21:32 +02:00
Zbigniew Jędrzejewski-Szmek 1ca6783f5e log: introduce a macro to format message id
The MESSAGE_ID=... stanza will appear in countless number of places.
It is just too long to write it out in full each time.

Incidentally, this also fixes a typo of MESSSAGE is three places.
2012-10-13 14:26:30 +02:00
Lennart Poettering 81270860a5 journal: suppress structured messages if they'd go to the console 2012-09-03 18:59:04 -07:00
Lennart Poettering 877d54e9b0 journal: generate structured journal messages for a number of events 2012-09-03 18:59:04 -07:00
Michal Schmidt 7cf82e0bb1 job: info message if JOB_VERIFY_ACTIVE detects an inactive unit 2012-05-14 14:29:54 +02:00
Michal Schmidt 1f136e7acf job: report the status of first half of JOB_RESTART the same as JOB_STOP 2012-05-14 14:29:54 +02:00
Michal Schmidt c69182961b unit: unit type dependent status messages
Instead of generic "Starting..." and "Started" messages for all unit use
type-dependent messages. For example, mounts will announce "Mounting..."
and "Mounted".

Add status messages to units of types that used to be entirely silent
(automounts, sockets, targets, devices). For unit types whose jobs are
instantaneous, report only the job completion, not the starting event.
Socket units with non-instantaneous jobs are rare (Exec*= is not used
often in socket units), so I chose not to print the starting messages
for them either.

This will hopefully give people better understanding of the boot.
2012-05-14 14:29:53 +02:00
Michal Schmidt 5f23d5b149 job: change red [ABORT] status to yellow [DEPEND]
The red "[ABORT]" for a dependency failure is too scary.
It suggests a crash. And it suggests a problem with the unit itself.
Change it to a yellow "[DEPEND]" message. The color communicates the
level of seriousness better.
2012-05-14 14:29:53 +02:00
Michal Schmidt 66aa6f7fbb job: only jobs on the runqueue can be run 2012-05-14 14:29:53 +02:00
Michal Schmidt e0209d83e7 core: add NOP jobs, job type collapsing
Two of our current job types are special:
JOB_TRY_RESTART, JOB_RELOAD_OR_START.

They differ from other job types by being sensitive to the unit active state.
They perform some action when the unit is active and some other action
otherwise. This raises a question: when exactly should the unit state be
checked to make the decision?

Currently the unit state is checked when the job becomes runnable. It's more
sensible to check the state immediately when the job is added by the user.
When the user types "systemctl try-restart foo.service", he really intends
to restart the service if it's running right now. If it isn't running right
now, the restart is pointless.

Consider the example (from Bugzilla[1]):

sleep.service takes some time to start.
hello.service has After=sleep.service.
Both services get started. Two jobs will appear:
 hello.service/start  waiting
 sleep.service/start  running
Then someone runs "systemctl try-restart hello.service".

Currently the try-restart operation will block and wait for
sleep.service/start to complete.

The correct result is to complete the try-restart operation immediately
with success, because hello.service is not running. The two original
jobs must not be disturbed by this.

To fix this we introduce two new concepts:
- a new job type: JOB_NOP
  A JOB_NOP job does not do anything to the unit. It does not pull in any
  dependencies. It is always immediately runnable. When installed to a unit,
  it sits in a special slot (u->nop_job) where it never conflicts with
  the installed job (u->job) of a different type. It never merges with jobs
  of other types, but it can merge into an already installed JOB_NOP job.

- "collapsing" of job types
  When a job of one of the two special types is added, the state of the unit
  is checked immediately and the job type changes:
  JOB_TRY_RESTART     -> JOB_RESTART or JOB_NOP
  JOB_RELOAD_OR_START -> JOB_RELOAD or JOB_START
  Should a job type JOB_RELOAD_OR_START appear later during job merging, it
  collapses immediately afterwards.
  Collapsing actually makes some things simpler, because there are now fewer
  job types that are allowed in the transaction.

[1] Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=753586
2012-04-25 18:38:27 +02:00
Michal Schmidt 5273510e9f transaction: cancel jobs non-recursively on isolate
Recursive cancellation of jobs would trigger OnFailure actions of
dependent jobs. This is not desirable when isolating.

Fixes https://bugzilla.redhat.com/show_bug.cgi?id=798328
2012-04-23 08:13:02 +02:00
Michal Schmidt 39a18c60d0 job: serialize jobs properly
Jobs were not preserved correctly over a daemon-reload operation.
A systemctl process waiting for a job completion received a job removal
signal. The job itself changed its id. The job timeout started ticking all
over again.

This fixes the deficiencies.
2012-04-24 01:54:15 +02:00
Lennart Poettering 78ff1acdfe job: the status messages are proper sentences, hence end them with a full stop 2012-04-22 15:39:26 +02:00
Michal Schmidt 97e6a11996 dbus-job: allow multiple bus clients
Merging of jobs can result in more than one client being interested in a job.
2012-04-20 17:12:29 +02:00
Michal Schmidt d6a093d098 transaction: remove checks for installed
Transactions cannot contain installed jobs anymore. Remove the now pointless
checks.
2012-04-20 17:12:29 +02:00
Michal Schmidt 656bbffc6c transaction: rework merging with installed jobs
Previously transactions could reference installed jobs. It made some issues
difficult to fix.

This sets new rules for jobs:
A job cannot be both a member of a transaction and installed. When jobs are
created, they are linked to a transaction. The whole transaction is constructed
(with merging of jobs within, etc.). When it's complete, all the jobs are
unlinked from it one by one and let to install themselves. It is during the
installation when merging with previously installed jobs (from older
transactions) is contemplated.

Merging with installed jobs has different rules than merging within a
transaction:
 - An installed conflicting job gets cancelled. It cannot be simply deleted,
   because someone might be waiting for its completion on DBus.
 - An installed, but still waiting, job can be safely merged into.
 - An installed and running job can be tricky. For some job types it is safe to
   just merge. For the other types we merge anyway, but put the job back into
   JOB_WAITING to allow it to run again. This may be suboptimal, but it is not
   currently possible to have more than one installed job for a unit.

Note this also fixes a bug where the anchor job could be deleted during merging
within the transaction.
2012-04-20 17:12:29 +02:00
Michal Schmidt 05d576f1f7 job: separate job_install()
Let the jobs install themselves.
2012-04-20 17:12:28 +02:00
Michal Schmidt e6eda1f23e transaction: remove the anchor link
tr->anchor_job is sufficient.
2012-04-20 17:12:28 +02:00
Michal Schmidt 1da4264fbd job: jobs shouldn't need to know about transaction anchors
Let the transactions maintain their own anchor links.
2012-04-20 17:12:28 +02:00
Michal Schmidt 668ad332a4 job: job_new() can find the manager from the unit 2012-04-20 17:12:27 +02:00
Michal Schmidt 7527cb5275 manager: Transaction as an object
This makes it obvious that transactions are short-lived. They are created in
manager_add_job() and destroyed after the application of jobs.
It also prepares for a split of the transaction code to a new source.
2012-04-20 17:12:27 +02:00
Michal Schmidt 97e7d748d1 job: job_uninstall()
Split the uninstallation of the job from job_free() into a separate function.
Adjust the callers.

job_free() now only works on unlinked and uninstalled jobs. This enforces clear
thinking about job lifetimes.
2012-04-20 17:12:27 +02:00
Michal Schmidt 02a3bcc6b4 job: allow job_free() only on already unlinked jobs
job_free() is IMO too helpful when it unlinks the job from the transaction.
The callers should ensure the job is already unlinked before freeing.
The added assertions check if anyone gets it wrong.
2012-04-20 17:12:27 +02:00
Lennart Poettering 691206668a unit: signal explicitly if a condition failed in unit_start()
We shouldn't print a status message on the console if we skipped a unit
due to a condition. Hence make unit_start() return -ENOEXEC in such a
case which is mapped to JOB_SKIPPED which results in no console message.
2012-04-13 19:46:51 +02:00
Lennart Poettering 5430f7f2bc relicense to LGPLv2.1 (with exceptions)
We finally got the OK from all contributors with non-trivial commits to
relicense systemd from GPL2+ to LGPL2.1+.

Some udev bits continue to be GPL2+ for now, but we are looking into
relicensing them too, to allow free copy/paste of all code within
systemd.

The bits that used to be MIT continue to be MIT.

The big benefit of the relicensing is that closed source code may now
link against libsystemd-login.so and friends.
2012-04-12 00:24:39 +02:00
Kay Sievers b30e2f4c18 move libsystemd_core.la sources into core/ 2012-04-11 16:03:51 +02:00
Renamed from src/job.c (Browse further)