This page summarizes recent work on ZODB Components, as well as the work done from September 2010 to February 2011 on portal type classes, ZODB property sheets, and accessor generation.

Contents

Portal type class

What has been done?
As a developer, what do I need to pay attention to?

You cannot create instances of documents directly:

container._setObject("some_id", MyUberDocumentClass("some_id"))

This use should be banned, and will break. Instead, you should use:

container.newContent(portal_type="...", id="....")

You can experiment to understand the difference: the first form creates an instance of a Document class, whereas the second creates an instance of a portal type class. We only want instances of portal type classes.

What needs to be done?
Hacking advanced notes

ZODB Property Sheets

What has been done?
As a developer, what do I need to pay attention to?
What needs to be done?
Hacking advanced notes

Constraints

What has been done?
As a developer, what do I need to pay attention to?
Hacking advanced notes

ZODB Components

This only explains how to use ZODB Components from a developer point of view and what has changed since the presentation at Europython 2012.

What has been done?

Documents, Extensions and Tests in bt5 are no longer stored in instance home but directly into ZODB. This provides several benefits, such as:

What should I know as a developer?

The general technical implementation of ZODB Components for ERP5 is documented in the slides written for Europython 2012. You can find the slides and notes there and the video there. Even though there have been some changes since then, the general idea still stands, so that material is still worth reading.

Note that TYPE, as used in this document, refers to the ZODB Component type, which can currently be document, extension or test.

Here is what you should know when using ZODB Components:

ID Naming convention

Even though ID could be anything, it should be in the following format:

TYPE.VERSION.REFERENCE

The migration of bt5 Documents, Extensions and Tests from the filesystem follows this naming convention.
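The convention can be sketched with a pair of helpers. This is only an illustration, not ERP5 code: build_component_id and parse_component_id are hypothetical names, and the sketch assumes that the version itself contains no dot.

```python
from collections import namedtuple

# The three Component types named in this document.
COMPONENT_TYPES = ("document", "extension", "test")

ComponentId = namedtuple("ComponentId", "type version reference")


def build_component_id(component_type, version, reference):
    """Return an ID of the form TYPE.VERSION.REFERENCE."""
    if component_type not in COMPONENT_TYPES:
        raise ValueError("unknown component type: %r" % (component_type,))
    return "%s.%s.%s" % (component_type, version, reference)


def parse_component_id(component_id):
    """Split an ID into its three parts, or return None if it does not
    follow the convention (assumption: the version contains no dot)."""
    parts = component_id.split(".", 2)
    if len(parts) != 3 or parts[0] not in COMPONENT_TYPES or not all(parts):
        return None
    return ComponentId(*parts)


example_id = build_component_id("extension", "project", "Bar")
parsed = parse_component_id(example_id)
```

Here example_id is extension.project.Bar, the ID used in the Extension Component example of this page.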

Adding custom version

For projects, you should add at least one specific version, as follows:

  1. Add a version and its priority in Portal Properties, for example: project | 60.0
  2. Add this version to the Registered Version Priority Selection field in the Business Template view.

For ERP5 Components, the erp5 version is already defined in erp5_core, so you don't need to add anything.

Migration of existing bt5 Documents, Extensions, Tests and Products

Unless you are using the ERP5-specific version, you should create a new version first (see the previous section).

You can migrate a Business Template with the Migrate Components from Filesystem action in the Business Template view. On the next screen, you can specify the version of the Components to be migrated.

Note that the migration is all or nothing, and that ZODB Components are not automatically validated.

Also, Products in bt5 are deprecated: you must either migrate your Products to Documents or move them to normal Products, through your SlapOS Software Release recipe.

Extension Component (erp5.component.extension)

When adding an External Method, you can specify Module Name exactly as you used to do.

For example, for a ZODB Extension Component whose version is project, whose reference is Bar and whose ID is extension.project.Bar, you only need to specify Bar. If another Extension Component has the same reference and a version with a higher priority, that one is used automatically instead. From an implementation point of view, this actually imports erp5.component.extension.Bar, which is equivalent to erp5.component.extension.HIGHEST_PRIORITY_VERSION_version.Bar.

By default, when Bar is specified as Module Name, ZODB Components are looked up first; if there is no such Component, the lookup falls back on the filesystem.
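The lookup order described above can be sketched as follows. This is a toy model, not the ERP5 implementation: the dictionaries, the priority given to the erp5 version, and the resolve_extension name are all assumptions made for the illustration.

```python
# Hypothetical version priorities, as they could be entered in Portal Properties.
VERSION_PRIORITY = {"erp5": 0.0, "project": 60.0}

# Hypothetical validated Extension Components, keyed by (version, reference).
ZODB_EXTENSIONS = {
    ("erp5", "Bar"): "extension.erp5.Bar",
    ("project", "Bar"): "extension.project.Bar",
    ("erp5", "Baz"): "extension.erp5.Baz",
}

# Hypothetical filesystem Extensions, used only as a fallback.
FILESYSTEM_EXTENSIONS = {"Legacy": "Extensions/Legacy.py"}


def resolve_extension(reference):
    """Return the Component ID with the highest-priority version for this
    reference, or the filesystem Extension if no Component matches."""
    candidates = [
        (VERSION_PRIORITY[version], component_id)
        for (version, ref), component_id in ZODB_EXTENSIONS.items()
        if ref == reference and version in VERSION_PRIORITY
    ]
    if candidates:
        # Tuples compare on priority first, so max() picks the highest one.
        return max(candidates)[1]
    return FILESYSTEM_EXTENSIONS.get(reference)


bar = resolve_extension("Bar")
legacy = resolve_extension("Legacy")
```

With these sample values, Bar resolves to the project version because 60.0 is the higher priority, and Legacy falls back on the filesystem entry.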

Document Component (erp5.component.document)

Like filesystem bt5 Documents, ZODB Document Components in a bt5 must only be used as the Type Class of Portal Types. If you use these documents elsewhere, in tests for example, you must import them from erp5.component.document instead of erp5.document.

Test Component (erp5.component.test)

Basically, a Test Component behaves like a Document Component. However, as a Test Component lives within a bt5, there is a chicken-and-egg issue with the runUnitTest command when installing bt5 dependencies: in current Unit Tests, the list of required bt5s is defined in the getBusinessTemplateList() class method, which requires loading the Component. However, the Component cannot be loaded until the bt5 (and its dependencies) have been installed, as it may depend on Document Components or other Test Components.

One solution would have been to fiddle with sys.path and implement a workaround to load the Component without installing any bt5, but that would be hackish and would not work when trying to import Document Components.

Therefore, the implemented solution is to specify, on the runUnitTest command line, the bt5 where the test can be found. This bt5 is installed along with its dependencies using the bt5list file (so you must make sure that this file is up to date before running any test).

Migration steps:

  1. This should already be the case, but all bt5 dependencies must be properly defined (dependency_list Business Template property, or Dependencies field on the Business Template view).

  2. For bt5s required specifically to run tests, there is a new property, test_dependency_list (Test Dependencies on the Business Template view), where they can be added. Please note that, contrary to filesystem tests, the bt5s are not force-installed, so you must define all dependencies, including resolving virtual dependencies (for example, for erp5_full_text_catalog, you can add erp5_full_text_myisam_catalog to Test Dependencies).

  3. For a customer project, make sure that your SlapOS recipe generates bt5list for your customer bt5s. Also, in your customer tests/__init__.py, add the following path to your tests path:

    %s/bt5/*/TestTemplateItem/portal_components/test.*.test*.py
    

Finally, to execute a Test Component as a Live Test, you can use the Run Live Tests action of Component Tool. As for the runUnitTest command, assuming that testFoo is in a bt5 called hogehoge:

runUnitTest hogehoge:testFoo
Procedure to upgrade customer project
  1. With erp5.git before merge request 1032 (2b8c630500f8a65566cf5ccf76b5215add840e54):
    1. Replace deprecated newTempXXX calls as done in 26e3c68b10be9165318dec9c02184dca0398d3e4.
    2. Migrate all customer Products to ZODB Components in their appropriate bt5s using the Migrate Components from Filesystem Business Template Action. This automatically selects the classes used in the current Business Template Portal Types and does not delete anything from the filesystem once done. Also, this fixes imports only for the migrated files.
    3. Commit.
  2. With current erp5.git:
    1. Delete .pyc files to make sure that now-deleted Documents are not loaded: git clean -xdf product/
    2. Fix imports (the name of the module before migration from the FS is in the source_reference property). For now, you can use the https://lab.nexedi.com/nexedi/erp5/uploads/cef2ce5429e7abb9c9d0ab53b75b6593/fix_imports script: ./fix_imports /path/to/erp5/repository/ /path/to/customer/repository/
    3. Commit.
    4. Regenerate bt5list: ./product/ERP5/bin/genbt5list bt5 product/ERP5/bootstrap
    5. Update all bt5s.
    6. Filesystem Products can now be deleted from the FS. Also make sure to remove .pyc files: git clean -xdf -e bt5list product/
    7. Commit.
Major changes since the presentation at Europython 2012

This section lists major changes only. It excludes bootstrap issues, minor bug fixes and the UI improvements made here and there.

Security

Access to Component Tool has been further restricted (previously, anyone could view Components). Permissions are now set on the Component Tool class rather than on the instance, so they can easily be changed at any time, instead of only being set at creation or through an upgrade script.

Pylint

Before, in order to check that the source code was somewhat valid, the code was actually executed, but this approach has the following drawbacks:

  • Executing the source code may have side effects, such as importing modules or applying monkey-patches.
  • It does not detect errors in code executed later (functions...).
  • Only the first error is reported.

The source code is now checked statically through Pylint if Pylint can be imported; otherwise the check falls back on executing the source code as before (please note that Pylint has been added to the SlapOS recipe specifically for ZODB Components, so you may need to update your environment).

Pylint has been chosen over other (faster) implementations such as pyflakes because it can also check coding style and naming conventions, which will be used in the future. Moreover, it seems to report errors that pyflakes cannot find.

As a side note, editing ZODB Components through Ace Editor has been greatly improved: you can click directly on an error or warning and the editor jumps to the corresponding line and column.

Import lock deadlock

Upon any import in Python < 3.3, the Python global import lock is acquired (to avoid race conditions while checking sys.path, to prevent incomplete modules from being seen by other threads or processes, and to prevent a module from being executed twice). Therefore, ZODB Component import hooks (following PEP 302) are protected by the import lock.

However, there was a deadlock when importing Components: these import hooks fetch properties from ZODB, which may unpickle objects in another thread (for an Exception class with ZEO, for example), and that thread then tries to acquire the import lock when importing classes.

From now on, the import lock is released in import hooks until sys.path is actually modified and there is another lock (aq_method_lock, common to Portal Type as Classes and ZODB Property Sheets) to prevent entering import hooks in parallel.
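The general shape of such a hook, guarded by its own lock, can be sketched as below. This is a toy written for Python 3, where the global import lock no longer exists, so it does not reproduce the release of the import lock itself. COMPONENT_SOURCE stands in for source code stored in ZODB, hook_lock plays the role of aq_method_lock, and toy_component and ToyComponentFinder are hypothetical names.

```python
import importlib
import importlib.abc
import importlib.util
import sys
import threading

# Stand-in for Component source code stored in ZODB (hypothetical content).
COMPONENT_SOURCE = {
    "toy_component.Bar": "def hello():\n    return 'hello from Bar'\n",
}

# Plays the role of aq_method_lock: only one thread runs the hook at a time.
hook_lock = threading.RLock()


class ToyComponentFinder(importlib.abc.MetaPathFinder, importlib.abc.Loader):
    package = "toy_component"

    def find_spec(self, fullname, path=None, target=None):
        if fullname == self.package:
            # The package itself is empty and only hosts the Components.
            return importlib.util.spec_from_loader(fullname, self, is_package=True)
        if fullname in COMPONENT_SOURCE:
            return importlib.util.spec_from_loader(fullname, self)
        return None  # let the other finders handle everything else

    def create_module(self, spec):
        return None  # use the default module creation

    def exec_module(self, module):
        if module.__name__ == self.package:
            return
        with hook_lock:
            source = COMPONENT_SOURCE[module.__name__]
            code = compile(source, "<%s>" % module.__name__, "exec")
            exec(code, module.__dict__)


sys.meta_path.append(ToyComponentFinder())

bar_module = importlib.import_module("toy_component.Bar")
greeting = bar_module.hello()
```

The point of the dedicated lock is that the hook can be serialized without holding a lock that unrelated imports in other threads also need.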

If performance ends up being too bad, a solution would be to introduce a lock per Component package (still coarse-grained) or a lock per Component (the finest grain we could do), but it seems to be working well enough as it is.

The best solution would be to use Python 3.3, as there is no longer a global import lock but a lock per module. However, the import machinery has completely changed (it is implemented in Python and not in C anymore), so it would probably be quite difficult to backport...

Export of Workflow State

A Component was automatically validated on Business Template installation, but this meant, among other things, that exchanging bt5 containing Components with errors was not possible.

The last Workflow History of Component Validation Workflow is now exported without adding anything, thanks to a new Property introduced in Business Template (before, you could only export the full Workflow History).

What needs to be done or could be implemented later on?
Do not propagate changes on other nodes

As requested by a customer, it could be fairly useful to be able to change ZODB Components on one specific node before the changes are actually propagated to the other nodes (for example, when fixing a bug in production, to allow testing on only one node).

Introspection

Being able to know where a given ZODB Component comes from and where it is currently used (which Portal Type classes, which Property Sheets and so on). This should be common to ZODB Property Sheets and Portal Type as Classes.

Migration of Products

Once bugs found with bt5 ZODB Components and other bugs requiring restart of ERP5 have been fixed, the next milestone is to migrate filesystem Products to ZODB.

Partial reset

For now, as with ZODB Property Sheets and Portal Type as Classes, every time a ZODB Component is modified, all ZODB Components are reset. Once introspection is properly implemented, it should be possible to reset only the modified Component and its dependencies.

[DONE] Remove ClassTool

ClassTool is no longer necessary and could probably be removed after the merge.

Pylint without intermediate file on the filesystem

Currently, an intermediate file on the filesystem is used to perform static checking, but this should not be necessary. Therefore, pylint should be patched so that it can take a string instead of a filename.

Accessor generation

What has been done?

We're getting rid of Base._aq_dynamic, and accessors are now generated directly from Property Sheet definitions, and put into Accessor Holder: one Accessor Holder for each existing Property Sheet item.

Accessor Holders are classes, and you can see them in the method resolution order of your ERP5 objects. For instance, for a person, person.__class__.mro() is:

(<class 'erp5.portal_type.Person'>,
 <class 'Products.ERP5.Document.Person.Person'>,
 <class Products.ERP5.mixin.encrypted_password.EncryptedPasswordMixin at 0xcce42cc>,
 <class 'Products.ERP5Type.XMLObject.XMLObject'>,
 <class 'Products.ERP5Type.Core.Folder.Folder'>,
 <class Products.ERP5Type.CopySupport.CopyContainer at 0xb340b0c>,
 <class 'Products.CMFCore.CMFBTreeFolder.CMFBTreeFolder'>,
 <class 'Products.BTreeFolder2.BTreeFolder2.BTreeFolder2Base'>,
 [...]
 <class 'erp5.accessor_holder.BaseAccessorHolder'>,
 <class 'erp5.accessor_holder.DublinCore'>,
 <class 'erp5.accessor_holder.Task'>,
 <class 'erp5.accessor_holder.Reference'>,
 <class 'erp5.accessor_holder.Person'>,
 <class 'erp5.accessor_holder.DefaultImage'>,
 <class 'erp5.accessor_holder.Mapping'>,
 <class 'erp5.accessor_holder.CategoryCore'>,
 <class 'erp5.accessor_holder.Base'>,
 <class 'erp5.accessor_holder.Login'>,
 <class 'erp5.accessor_holder.XMLObject'>,
 <class 'erp5.accessor_holder.Folder'>,
 <class 'erp5.accessor_holder.SimpleItem'>,
 <type 'ExtensionClass.Base'>,
 <type 'object'>)

Each accessor_holder class corresponds to accessors that come directly from a Property Sheet. Note as well accessor_holder.BaseAccessorHolder which contains common methods such as related category getters and portal type group getters.
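The idea of one generated class per Property Sheet can be sketched with type(). This is a simplified model, not the ERP5 generation code: PROPERTY_SHEETS, create_accessor_holder and PersonDocument are hypothetical, and real accessors do far more than read and write an attribute.

```python
# Hypothetical Property Sheet definitions: sheet name -> property ids.
PROPERTY_SHEETS = {
    "Reference": ["reference"],
    "Person": ["first_name", "last_name"],
}


def camel_case(property_id):
    return "".join(part.capitalize() for part in property_id.split("_"))


def make_getter(property_id):
    def getter(self):
        return self.__dict__.get(property_id)
    getter.__name__ = "get" + camel_case(property_id)
    return getter


def make_setter(property_id):
    def setter(self, value):
        self.__dict__[property_id] = value
    setter.__name__ = "set" + camel_case(property_id)
    return setter


def create_accessor_holder(sheet_name, property_ids):
    """Build one class holding the accessors of one Property Sheet."""
    namespace = {}
    for property_id in property_ids:
        for accessor in (make_getter(property_id), make_setter(property_id)):
            namespace[accessor.__name__] = accessor
    return type(sheet_name, (object,), namespace)


class PersonDocument(object):
    """Stand-in for the filesystem Document class."""


holders = [create_accessor_holder(name, ids)
           for name, ids in PROPERTY_SHEETS.items()]

# The portal type class lists the Document class first, then the holders,
# so the holders show up after it in the method resolution order.
Person = type("Person", (PersonDocument,) + tuple(holders), {})

person = Person()
person.setFirstName("Ada")
mro_names = [klass.__name__ for klass in Person.mro()]
```

As in the listing above, two classes in the resulting mro() share the name Person: the portal type class and the accessor holder generated from the Person Property Sheet.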

_aq_reset is gone as well.

As a developer, what do I need to pay attention to?
What needs to be done?
Hacking advanced notes
Performances

The effective tradeoff of this change is the following: we trade dynamic lazy generation for static generation plus a few mro()-deep lookups. See for example the Base._edit code, where we have to look through a class's mro() to fetch the list of restricted methods. Places like this, where we have to walk a class's mro(), are costly. On the other hand, they are EASY to optimize. With lazy aq_dynamic, the environment was constantly changing, and we had no guarantee that everything had been generated. But with portal type classes and accessor holders, nothing changes once the class has been generated: at the end of loadClass() (ERP5.dynamic.lazy_class), nothing will happen to the class anymore, and its data is "static". This means that all the deep computations we do can safely be cached on the class object for later. Back to our _edit example, the list of restricted method ids can, and probably should, be computed once and stored on the portal type class.
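A minimal sketch of that caching pattern, assuming the class no longer changes after generation. The restricted_method_ids attribute and the getRestrictedMethodIdSet name are hypothetical and are not taken from the Base._edit code.

```python
class Base(object):
    # Hypothetical attribute: method ids restricted at this level.
    restricted_method_ids = ("setId",)

    @classmethod
    def getRestrictedMethodIdSet(cls):
        # Look in cls.__dict__ only, so that a subclass never reuses the
        # value cached on one of its parents.
        cached = cls.__dict__.get("_restricted_method_id_set")
        if cached is None:
            collected = set()
            for klass in cls.mro():  # the costly walk, done once per class
                collected.update(klass.__dict__.get("restricted_method_ids", ()))
            cached = frozenset(collected)
            cls._restricted_method_id_set = cached
        return cached


class Folder(Base):
    restricted_method_ids = ("setUid",)


class Person(Folder):
    pass


person_restricted = Person.getRestrictedMethodIdSet()
```

The first call walks the mro() and stores the result on Person; later calls return the stored frozenset. Such a cache is only safe because the class is static once loaded, and it would have to be dropped on reset.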

A new performance test can now be written. On a tiny instance, that only has erp5_core:

portal.portal_types.resetDynamicDocuments()
for property_sheet in portal.portal_property_sheets.contentValues():
  property_sheet.createAccessorHolder()

And time this loop.

The impact of accessor generation is now easy to measure and improve, instead of being a giant octopus with tentacles that unfold at every dynamic call.

Once you've looped over this list, you're mostly done, and the rest of the code only gathers useful accessor holders and puts them on the right classes. Cherry picking with workflow twists, as you still need to wrap accessors as WorkflowMethod on the portal type class. We may start with a relatively higher cost, but that's easier to improve, easier to profile, easier to optimize.

If you then want to assess the cost of Workflow method generation, you can do something like:

portal.portal_types.resetDynamicDocuments()
for portal_type_id in portal.portal_types.objectIds():
  getattr(erp5.portal_type, portal_type_id).loadClass()

And once again, time it.
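One possible way to time either loop is a small helper around time.perf_counter. The sketch below is only an illustration: timed and stand_in_workload are hypothetical names, and the workload is a placeholder for the ERP5 calls shown above, which need a running instance.

```python
import time
from contextlib import contextmanager

# Collected durations in seconds, keyed by label.
timings = {}


@contextmanager
def timed(label):
    """Record the wall-clock duration of the enclosed block."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[label] = time.perf_counter() - start


def stand_in_workload():
    # Placeholder for the accessor holder or loadClass() loop.
    return sum(i * i for i in range(10000))


with timed("accessor holders"):
    stand_in_workload()

print("accessor holders: %.6f s" % timings["accessor holders"])
```

On a real instance, the body of the with block would be one of the two loops above, run right after resetDynamicDocuments().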

Memory Cost

Generally speaking, we generate things less blindly, and after cleanups the memory usage should drop to a lower figure than with aq_dynamic.

The globals in Utils are evil, and cache too much. I suppose that removing them or emptying them WILL save a lot of memory. Similarly, I'm questioning the validity and use of the workflow_method_registry attributes on portal type, property holder and accessor holder classes.

resetDynamicDocuments vs resetDynamicDocumentsAtTransactionBoundary

There was something relatively bad in the way we were using _aq_reset. Scenario:

# some_portal_type
portal_type.edit(type_class="Foo", type_base_category_list=["source",])

This edit fires two workflow triggers, one for the class change and one for the base category change. Each trigger used to cause an _aq_reset call.

Generally speaking, if during one transaction we had N property changes on M different objects, we would call _aq_reset N*M times. That raises the question: is it absolutely necessary to reset accessors immediately after each action?

If we think about it, 100% of the actions that can trigger accessor regeneration are user-triggered. This means that transactions will be short-lived, and that in case of success, a commit() will happen shortly afterwards.

So can we delay the reset until commit time? Yes, it seemed so.

It has a few nice properties:

  • If one edit fires several workflow triggers, only one reset will happen.
  • In tests, if we pay attention to what we're doing, we can group portal type / accessor / base category setups and minimize the number of resets.
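The "only one reset per transaction" behaviour can be sketched with a toy transaction that supports before-commit hooks. This is not the ERP5 implementation: ToyTransaction and ToyTypesTool are hypothetical stand-ins, and aborted transactions are not handled.

```python
class ToyTransaction(object):
    """Minimal stand-in for a transaction with before-commit hooks."""

    def __init__(self):
        self._before_commit_hooks = []

    def add_before_commit_hook(self, hook):
        self._before_commit_hooks.append(hook)

    def commit(self):
        hooks, self._before_commit_hooks = self._before_commit_hooks, []
        for hook in hooks:
            hook()


class ToyTypesTool(object):
    def __init__(self, transaction):
        self.transaction = transaction
        self.reset_count = 0
        self._reset_pending = False

    def resetDynamicDocuments(self):
        # Immediate, costly reset.
        self.reset_count += 1

    def resetDynamicDocumentsAtTransactionBoundary(self):
        # Register the reset once; further requests in the same
        # transaction are ignored.
        if self._reset_pending:
            return
        self._reset_pending = True
        self.transaction.add_before_commit_hook(self._reset_once)

    def _reset_once(self):
        self._reset_pending = False
        self.resetDynamicDocuments()


txn = ToyTransaction()
tool = ToyTypesTool(txn)

# Two workflow triggers fired by a single edit() both request a reset.
tool.resetDynamicDocumentsAtTransactionBoundary()
tool.resetDynamicDocumentsAtTransactionBoundary()
resets_before_commit = tool.reset_count

txn.commit()
resets_after_commit = tool.reset_count
```

No reset happens while the transaction is running, and exactly one happens at commit, however many triggers requested it.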

Why did I care so much about the number of resets? With the new accessor generation, we do a bit more during generation, and the generation of basic properties in particular is very costly. So chaining several resets is costly, much more than two aq_resets were.