Codec section
A codec pipeline section lets you alter the character encoding of item
values, recoding text to and from unicode using any of the codecs supported
by Python. The codec section blueprint name is
`collective.transmogrifier.sections.codec`.
Which values to recode is determined by the `keys` option, which takes a set
of newline-separated key names. If a key name starts with `re:` or `regexp:`,
it is treated as a regular expression instead.
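The key matching described above can be sketched in plain Python. This is a minimal illustration, not the section's actual implementation: the helper name `build_matcher` is hypothetical, and anchoring the regular expression at the start of the key (`re.match`) is an assumption.

```python
import re


def build_matcher(keys_option):
    """Build a matcher from a newline-separated `keys` value (hypothetical helper)."""
    patterns = []
    for line in keys_option.splitlines():
        line = line.strip()
        if not line:
            continue
        for prefix in ("re:", "regexp:"):
            if line.startswith(prefix):
                # Everything after the prefix is compiled as a regular expression.
                patterns.append(re.compile(line[len(prefix):]))
                break
        else:
            # No prefix: the line is a literal key name.
            patterns.append(line)

    def match(key):
        """Return True for a literal hit, a match object for a regex hit, else None."""
        for pattern in patterns:
            if isinstance(pattern, str):
                if pattern == key:
                    return True
            else:
                found = pattern.match(key)  # assumption: anchored at the start
                if found:
                    return found
        return None

    return match


matcher = build_matcher("id\nre:^ti.*\n")
```

With this sketch, `matcher("id")` yields `True`, `matcher("title")` yields a match object, and `matcher("status")` yields `None`.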
The optional `from` and `to` options determine which codecs values are
recoded from and to. Both default to `unicode`, meaning no translation. If
either option is set to `default`, the current default encoding of the Plone
site is used.
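As a rough sketch of what `from` and `to` mean, the following Python 3 function decodes and then encodes a value. It is an illustration under stated assumptions, not the section's code: the function `recode` and its `site_encoding` parameter (a stand-in for the Plone site's default encoding) are hypothetical, and Python 3 `bytes`/`str` take the place of the Python 2 `str`/`unicode` types used elsewhere on this page.

```python
def recode(value, from_="unicode", to="unicode", site_encoding="utf-8"):
    """Decode `value` from one codec and encode it to another (hypothetical helper)."""
    # "default" stands for the site's default encoding; here it is just a parameter.
    if from_ == "default":
        from_ = site_encoding
    if to == "default":
        to = site_encoding

    # "unicode" means no translation on that side.
    if from_ != "unicode" and isinstance(value, bytes):
        value = value.decode(from_)
    if to != "unicode" and isinstance(value, str):
        value = value.encode(to)
    return value


decoded = recode(b"caf\xc3\xa9", from_="utf8")                 # bytes -> text
encoded = recode("foo", to="ascii")                            # text -> bytes
transcoded = recode(b"caf\xe9", from_="latin-1", to="utf8")    # bytes -> bytes
untouched = recode("foo")                                      # both sides "unicode"
```

Setting only `from` leaves the value as text, setting only `to` leaves it as an encoded byte string, and setting both transcodes between two byte encodings.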
To deal with possible encoding errors, you can set the error handlers of the
`from` and `to` codecs separately with the `from-error-handler` and
`to-error-handler` options, respectively. These default to `strict`, but can
be set to any error handler supported by Python, including `replace` and
`ignore`.
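The effect of the different error handlers can be seen with Python's own `encode` and `decode` methods. The snippet below is a standalone Python 3 illustration of the handler names, independent of transmogrifier; the sample strings are made up for the example.

```python
text = "caf\u00e9"  # contains one character that ASCII cannot represent

# Encoding side: corresponds to the handler chosen for the `to` codec.
encoded = {
    handler: text.encode("ascii", handler)
    for handler in ("replace", "ignore", "backslashreplace")
}

# With "strict", the same call raises instead of producing output.
try:
    text.encode("ascii", "strict")
    strict_failed = False
except UnicodeEncodeError:
    strict_failed = True

# Decoding side: corresponds to the handler chosen for the `from` codec.
raw = b"caf\xe9"  # Latin-1 bytes, which are not valid UTF-8
decoded = {
    handler: raw.decode("utf-8", handler)
    for handler in ("replace", "ignore")
}
```

Here `replace` substitutes a placeholder (`?` when encoding, U+FFFD when decoding), `ignore` drops the offending character, and `backslashreplace` writes an escape sequence such as `\xe9`.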
Also optional is the `condition` option, which lets you specify a TALES
expression; when it evaluates to False, no encoding or decoding takes place
for that key. The condition is evaluated for every matched key.
>>> codecs = """
... [transmogrifier]
... pipeline =
...     source
...     decode-all
...     encode-id
...     encode-title
...     logger
...
... [source]
... blueprint = collective.transmogrifier.sections.tests.samplesource
... encoding = utf8
...
... [decode-all]
... blueprint = collective.transmogrifier.sections.codec
... keys = re:.*
... from = utf8
...
... [encode-id]
... blueprint = collective.transmogrifier.sections.codec
... keys = id
... to = ascii
...
... [encode-title]
... blueprint = collective.transmogrifier.sections.codec
... keys = title
... to = ascii
... to-error-handler = backslashreplace
... condition = python:'Brand' not in item['title']
...
... [logger]
... blueprint = collective.transmogrifier.sections.logger
... name = logger
... level = INFO
... """
>>> registerConfig(u'collective.transmogrifier.sections.tests.codecs',
... codecs)
>>> transmogrifier(u'collective.transmogrifier.sections.tests.codecs')
>>> print handler
logger INFO
{'id': 'foo', 'status': u'\u2117', 'title': 'The Foo Fighters \\u2117'}
logger INFO
{'id': 'bar', 'status': u'\u2122', 'title': u'Brand Chocolate Bar \u2122'}
logger INFO
{'id': 'monty-python', 'status': u'\xa9', 'title': "Monty Python's Flying Circus \\xa9"}
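The `title` values in the output above follow from the `encode-title` options: titles containing `Brand` fail the condition and stay unicode, while the others are encoded to ASCII with `backslashreplace`. A minimal Python 3 re-creation of that one step, using two titles taken from the logged output (the loop is only an analogy for the section, not its code):

```python
titles = ["The Foo Fighters \u2117", "Brand Chocolate Bar \u2122"]

results = []
for title in titles:
    # Mirrors: condition = python:'Brand' not in item['title']
    if "Brand" not in title:
        # Mirrors: to = ascii, to-error-handler = backslashreplace
        title = title.encode("ascii", "backslashreplace")
    results.append(title)
```

The first title becomes a byte string with the non-ASCII character written as the escape `\u2117`; the second is left as text because the condition evaluates to False.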
The `condition` expression has access to the following:

| Name | Value |
|---|---|
| `item` | the current pipeline item |
| `key` | the name of the matched key |
| `match` | if the key was matched by a regular expression, the match object; otherwise boolean True |
| `transmogrifier` | the transmogrifier |
| `name` | the name of the codec section |
| `options` | the codec section options |
| `modules` | sys.modules |