13 Texinfo::Convert::Converter


13.1 Texinfo::Convert::Converter NAME

Texinfo::Convert::Converter - Parent class for Texinfo tree converters


13.2 Texinfo::Convert::Converter SYNOPSIS

  package Texinfo::Convert::MyConverter;

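  use Texinfo::Convert::Converter;
  our @ISA = qw(Texinfo::Convert::Converter);

  # customization defaults specific to this converter (empty placeholder)
  my %myconverter_defaults = ();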

  sub converter_defaults ($$) {
    return %myconverter_defaults;
  }
  sub converter_initialize($) {
    my $self = shift;
    $self->{'document_context'} = [{}];
  }

  sub convert($$) {
    ...
  }
  sub convert_tree($$) {
    ...
  }
  sub output($$) {
    ...
  }

  # end of Texinfo::Convert::MyConverter

  my $converter = Texinfo::Convert::MyConverter->converter(
                                               {'parser' => $parser});
  $converter->output($texinfo_tree);

13.3 Texinfo::Convert::Converter NOTES

The main purpose of the Texinfo Perl modules is to be used in texi2any to convert Texinfo to other formats. There is no promise of API stability.


13.4 Texinfo::Convert::Converter DESCRIPTION

Texinfo::Convert::Converter is a super class that can be used to simplify the initialization of converters. The class also provides some useful methods.

In turn, the converter should define some methods. Two are optional, converter_defaults and converter_initialize; they are used for initialization and to give information to Texinfo::Convert::Converter.

The convert_tree method is mandatory and should convert portions of a Texinfo tree. The output method is used by converters as the entry point for conversion to a file, with headers and so on. Although it is not called from other modules, it should in general be implemented by converters; output is called from texi2any. convert is not required, but is customarily used by converters as the entry point for the conversion of a whole Texinfo tree without the headers added when outputting to a file.

Existing backends may be used as examples that implement those methods. Texinfo::Convert::Texinfo together with Texinfo::Convert::PlainTexinfo, as well as Texinfo::Convert::TextContent are trivial examples. Texinfo::Convert::Text is less trivial, although still simple, while Texinfo::Convert::DocBook is a real converter that is also not too complex.

The documentation of Texinfo::Common, Texinfo::Convert::Unicode and Texinfo::Report describes modules and additional functions that may be useful for backends, while the parsed Texinfo tree is described in Texinfo::Parser.


13.5 Texinfo::Convert::Converter METHODS


13.5.1 Initialization

A converter object for a module subclassing Texinfo::Convert::Converter is created by calling the converter method, which should be inherited from Texinfo::Convert::Converter.

$converter = MyConverter->converter($options)

The $options hash reference holds options for the converter. In this hash reference, a parser object may be associated with the parser key. The other options are Texinfo customization options and a few other options that can be passed to the converter. Most of the customization options are described in the Texinfo manual. Those customization options, when appropriate, override the document content. The parser should not be available directly anymore after getting the associated information.

TODO: what about the other options? (All are used in converters; ’structuring’ is available in HTML through $converter->get_info().)

TODO: document this associated information (’parser_info’, ’indices_information’, ’floats’..., most of it available in the HTML converter, either through $converter->get_info() or label_command()).

The converter function returns a converter object (a blessed hash reference) after checking the options and performing some initializations, especially when a parser is given among the options. The converter is also initialized as a Texinfo::Report.

To help with these initializations, the modules subclassing Texinfo::Convert::Converter can define two methods:

%defaults = $converter->converter_defaults($options)

The module can provide a defaults hash for converter customization options. The $options hash reference holds options for the converter.

converter_initialize

This method is called at the end of the Texinfo::Convert::Converter converter initialization.
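The order of these steps can be illustrated with a toy model. The sketch below does not use Texinfo::Convert::Converter at all: ToyConverterBase and ToyConverter are hypothetical packages written only for this illustration, and the way defaults and options are merged is an assumption, not a description of the real implementation. It only shows the idea that defaults come from converter_defaults, that options passed by the caller take precedence, and that converter_initialize runs last.

```perl
use strict;
use warnings;

# Toy stand-in for the parent class; NOT the real Texinfo::Convert::Converter.
package ToyConverterBase;

sub converter {
  my ($class, $options) = @_;
  $options = {} if (!defined($options));
  my $self = bless {}, $class;
  # 1. defaults supplied by the subclass, if it defines the method
  my %defaults = $self->can('converter_defaults')
                   ? $self->converter_defaults($options) : ();
  # 2. options passed by the caller override the defaults (assumed merge rule)
  $self->{'conf'} = { %defaults, %$options };
  # 3. subclass hook, called at the end of the initialization
  $self->converter_initialize() if ($self->can('converter_initialize'));
  return $self;
}

# Hypothetical subclass defining the two optional methods.
package ToyConverter;
our @ISA = ('ToyConverterBase');

sub converter_defaults {
  my ($self, $options) = @_;
  return ('EXTENSION' => 'txt', 'SPLIT' => '');
}

sub converter_initialize {
  my $self = shift;
  $self->{'document_context'} = [{}];
}

package main;

my $toy = ToyConverter->converter({'EXTENSION' => 'out'});
print "EXTENSION: $toy->{'conf'}->{'EXTENSION'}\n";
```

In this toy model the caller's EXTENSION value replaces the default, while SPLIT keeps the default given by converter_defaults.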


13.5.2 Getting and setting customization variables

Texinfo::Convert::Converter implements a simple interface to set and retrieve Texinfo customization variables. Helper functions from diverse Texinfo modules needing customization information expect an object implementing get_conf and/or set_conf. The converter itself can therefore be used in such cases.

$converter->force_conf($variable_name, $variable_value)

Set the Texinfo customization option $variable_name to $variable_value. This method should rarely be used; its purpose is to make it possible to revert a customization that is always wrong for a given output format, such as splitting.

$converter->get_conf($variable_name)

Returns the value of the Texinfo customization variable $variable_name.

$status = $converter->set_conf($variable_name, $variable_value)

Set the Texinfo customization option $variable_name to $variable_value if not set as a converter option. Returns false if the customization option was not set.
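The difference between set_conf and force_conf can be sketched with a small stand-alone class. ToyConf is a hypothetical package that only mimics the behavior described above; its internal 'conf' and 'set' hashes, and the assumption that force_conf bypasses the check done by set_conf, are choices made for this illustration rather than documented details of Texinfo::Convert::Converter.

```perl
use strict;
use warnings;

# Hypothetical class mimicking the documented get_conf/set_conf/force_conf
# semantics; not the real implementation.
package ToyConf;

sub new {
  my ($class, $options) = @_;
  my $self = bless {'conf' => {}, 'set' => {}}, $class;
  foreach my $name (keys(%$options)) {
    $self->{'conf'}->{$name} = $options->{$name};
    # remember which variables were given as converter options
    $self->{'set'}->{$name} = 1;
  }
  return $self;
}

sub get_conf {
  my ($self, $name) = @_;
  return $self->{'conf'}->{$name};
}

# Refuses to override a value given as a converter option.
sub set_conf {
  my ($self, $name, $value) = @_;
  return 0 if ($self->{'set'}->{$name});
  $self->{'conf'}->{$name} = $value;
  return 1;
}

# Sets the value unconditionally (assumed behavior).
sub force_conf {
  my ($self, $name, $value) = @_;
  $self->{'conf'}->{$name} = $value;
  return 1;
}

package main;

my $conf = ToyConf->new({'SPLIT' => 'chapter'});
my $refused = $conf->set_conf('SPLIT', 'node');      # false: set as option
my $accepted = $conf->set_conf('EXTENSION', 'txt');  # true: not set before
$conf->force_conf('SPLIT', '');                      # reverts the splitting
```

Because helper functions only expect get_conf and/or set_conf, any object with this interface could be passed to them in place of a full converter.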


13.5.3 Conversion to XML

Some Texinfo::Convert::Converter methods target conversion to XML. Most methods take a $converter as argument to get some information and use methods for error reporting.

$formatted_text = $converter->xml_format_text_with_numeric_entities($text)

Replace quotation marks, and hyphens used to represent dashes in Texinfo text, with numeric XML entities.

$protected_text = $converter->xml_protect_text($text)

Protect special XML characters (&, <, >, ") of $text.

$comment = $converter->xml_comment($text)

Returns an XML comment for $text.

$result = xml_accent($text, $accent_command, $in_upper_case, $use_numeric_entities)

$text is the text appearing within an accent command. $accent_command should be a Texinfo tree element corresponding to an accent command taking an argument. $in_upper_case is optional, and, if set, the text is put in upper case. The function returns the accented letter as an XML named entity if possible, falling back to numeric entities if there is no named entity and to an ASCII transliteration as a last resort. $use_numeric_entities is optional. If set, numeric entities are used instead of named entities if possible.

$result = $converter->xml_accents($accent_command, $in_upper_case)

$accent_command is an accent command, which may have other accent commands nested. If $in_upper_case is set, the result should be upper cased. The function returns the accents formatted as XML.

$result = xml_numeric_entity_accent($accent_command_name, $text)

$accent_command_name is the name of an accent command. $text is the text appearing within the accent command. Returns the accented letter as an XML numeric entity, or undef if there is no such entity.
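To make the kind of output produced by xml_protect_text and xml_numeric_entity_accent concrete, here is a minimal stand-alone sketch. The functions sketch_xml_protect_text and sketch_numeric_entity_accent and the %sketch_combining table are hypothetical names introduced for this illustration; they approximate the documented behavior using the core Unicode::Normalize module and are not the functions of Texinfo::Convert::Converter. The use of decimal entities and of Unicode composition to find the accented letter are assumptions.

```perl
use strict;
use warnings;
use Unicode::Normalize qw(NFC);

# Escape the four characters listed for xml_protect_text.
sub sketch_xml_protect_text {
  my $text = shift;
  $text =~ s/&/&amp;/g;    # first, so that later entities are not re-escaped
  $text =~ s/</&lt;/g;
  $text =~ s/>/&gt;/g;
  $text =~ s/"/&quot;/g;
  return $text;
}

# Combining characters for a few Texinfo accent commands (subset only).
my %sketch_combining = (
  "'" => "\x{0301}",   # @' acute accent
  '`' => "\x{0300}",   # @` grave accent
  '^' => "\x{0302}",   # @^ circumflex
  '~' => "\x{0303}",   # @~ tilde
  '"' => "\x{0308}",   # @" umlaut
);

# Return a decimal numeric entity for the accented letter, or undef if
# the letter and accent do not compose into a single character.
sub sketch_numeric_entity_accent {
  my ($accent_command_name, $text) = @_;
  my $combining = $sketch_combining{$accent_command_name};
  return undef if (!defined($combining));
  my $composed = NFC($text . $combining);
  return undef if (length($composed) != 1);
  return '&#' . ord($composed) . ';';
}

print sketch_xml_protect_text('a < b & "c"'), "\n";
print sketch_numeric_entity_accent("'", 'e'), "\n";   # e with acute accent
```

With this sketch, an "e" under the acute accent command composes to U+00E9 and yields a numeric entity, while a letter such as "q", which has no precomposed acute form, yields undef.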


13.5.4 Helper methods

The module provides methods that may be useful for converters. Most methods take a $converter as argument to get some information, to use methods for error reporting (see Texinfo::Report) and to translate strings (see Texinfo::Translations). For useful methods that need a converter only optionally and can be used in converters that do not inherit from Texinfo::Convert::Converter, see Texinfo::Convert::Utils.

$contents_element = $converter->comma_index_subentries_tree($entry, $separator)

$entry is a Texinfo tree index entry element. The function sets up an array with the @subentry contents. The result is returned as contents in the $contents_element element, or undef if there is no such content. $separator is an optional separator argument used, if given, instead of the default: a comma followed by a space.

$result = $converter->convert_accents($accent_command, \&format_accents, $output_encoded_characters, $in_upper_case)

$accent_command is an accent command, which may have other accent commands nested. The function returns the accents formatted either as encoded letters if $output_encoded_characters is set, or formatted using \&format_accents. If $in_upper_case is set, the result should be uppercased.

$result = $converter->convert_document_sections($root, $file_handler)

This method splits the $root Texinfo tree at sections and calls convert_tree on the elements. If the optional $file_handler argument is given, the result is output to $file_handler; otherwise the resulting string is returned.

$succeeded = $converter->create_destination_directory($destination_directory_path, $destination_directory_name)

Create the destination directory $destination_directory_path. $destination_directory_path should be a binary string, while $destination_directory_name should be a character string that can be used in error messages. $succeeded is true if the creation was successful or unneeded, false otherwise.

($output_file, $destination_directory, $output_filename, $document_name, $input_basefile) = $converter->determine_files_and_directory($output_format)

Determine output file and directory, as well as names related to files. The result depends on the presence of @setfilename, on the Texinfo input file name, and on customization options such as OUTPUT, SUBDIR or SPLIT, as described in the Texinfo manual. $output_format is optional. If it is not set the current output format, if defined, is used instead. If not an empty string, _$output_format is prepended to the default directory name.

$output_file is mainly relevant when not split and should be used as the output file name. In general, if not split and $output_file is an empty string, it means that text should be returned by the converter instead of being written to an output file. This is used in the test suite. $destination_directory is either the directory $output_file is in, or, if split, the directory where the files should be created. $output_filename is, in general, the file name portion of $output_file (without directory), but can also be set based on @setfilename, in particular when $output_file is an empty string. $document_name is $output_filename without extension. $input_basefile is based on the input Texinfo file name, with the file name portion only (without directory).

The strings returned are text strings.
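The relationships between the returned values in the simple, non-split case can be illustrated with core Perl modules. The sketch below does not call determine_files_and_directory; the path out/manual.html is a made-up value, and the real method also takes @setfilename and customization options into account, which this illustration ignores. Whether the real $destination_directory carries a trailing separator is not specified here.

```perl
use strict;
use warnings;
use File::Basename qw(fileparse);

# Hypothetical output file, standing in for the first returned value.
my $output_file = 'out/manual.html';

# File name portion and directory of $output_file.
my ($output_filename, $destination_directory) = fileparse($output_file);

# $document_name is $output_filename without extension.
(my $document_name = $output_filename) =~ s/\.[^.]*$//;

print "$destination_directory $output_filename $document_name\n";
```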

($encoded_name, $encoding) = $converter->encoded_input_file_name($character_string_name, $input_file_encoding)
($encoded_name, $encoding) = $converter->encoded_output_file_name($character_string_name)

Encode $character_string_name in the same way as other file names are encoded in the converter, based on customization variables, and possibly on the input file encoding. Return the encoded name and the encoding used to encode the name. The encoded_input_file_name and encoded_output_file_name functions use different customization variables to determine the encoding.

The $input_file_encoding argument is optional. If set, it is used for the input file encoding. It is useful if there is more precise information on the input file encoding where the file name appeared.

Note that encoded_output_file_name and encoded_input_file_name are wrappers around the functions with the same names in Texinfo::Convert::Utils.

($caption, $prepended) = $converter->float_name_caption($float)

$float is a Texinfo tree @float element. This function returns the caption element that should be used for the float formatting and the $prepended Texinfo tree combining the type and label of the float.

$tree = $converter->float_type_number($float)

$float is a Texinfo tree @float element. This function returns the type and number of the float as a Texinfo tree with translations.

$end_line = $converter->format_comment_or_return_end_line($element)

Format the comment at the end of the line, or return the end of line associated with the element. In many cases, converters ignore comments, and output is better formatted with new lines added independently of the presence of a newline or comment in the initial Texinfo line, so most converters are better off not using this method.

$filename = $converter->node_information_filename($normalized, $node_contents)

Returns the normalized file name corresponding to the $normalized node name and to the $node_contents node name contents.

($normalized_name, $filename) = $converter->normalized_sectioning_command_filename($element)

Returns a normalized name $normalized_name corresponding to a sectioning command tree element $element, expanding the command argument using transliteration and character protection. Also returns $filename, the corresponding file name based on $normalized_name, taking into account additional constraints on file names and adding a file extension.

$converter->present_bug_message($message, $element)

Show a bug message using the $message text. Information on the $element tree element is used if it is given as argument.

$converter->set_global_document_commands($commands_location, $selected_commands)

Set the Texinfo customization options for @-commands. $selected_commands is an optional array reference containing the @-commands to set; if it is not given, all the global informative @-commands are set. $commands_location specifies where in the document the value should be taken from. The possibilities are:

before

Set to the values before document conversion, from defaults and command-line.

last

Set to the last value for the command.

preamble

Set sequentially to the values in the Texinfo preamble.

preamble_or_first

Set to the first value of the command if the first command is not in the Texinfo preamble, else set as with preamble, sequentially to the values in the Texinfo preamble.

Notice that the only effect of this function is to set a customization variable value: no @-command side effects are run, and no associated customization variables are set.

For more information on the function used to set the value for each command, see Texinfo::Common set_global_document_command.

$table_item_tree = $converter->table_item_content_tree($element, $contents)

$element should be an @item or @itemx tree element, and $contents should be the corresponding Texinfo tree contents. Returns a tree in which the @-command given as argument of the @*table containing $element has been applied to $contents.

$result = $converter->top_node_filename($document_name)

Returns a file name for the Top node file using either the TOP_FILE customization value, or the EXTENSION customization value and $document_name.
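The selection described for top_node_filename can be sketched as a stand-alone function. sketch_top_node_filename is a hypothetical name, the $conf hash reference stands in for values that would be obtained with $converter->get_conf(), and the handling of empty or undefined values is an assumption made for this illustration, not a statement about the real method.

```perl
use strict;
use warnings;

# Hypothetical re-statement of the documented rule: TOP_FILE wins if set,
# otherwise the name is built from $document_name and EXTENSION.
sub sketch_top_node_filename {
  my ($conf, $document_name) = @_;
  my $top_file = $conf->{'TOP_FILE'};
  return $top_file if (defined($top_file) and $top_file ne '');
  return undef if (!defined($document_name));
  my $extension = $conf->{'EXTENSION'};
  return $document_name . '.' . $extension
    if (defined($extension) and $extension ne '');
  # assumed fallback when there is no extension
  return $document_name;
}

print sketch_top_node_filename({'TOP_FILE' => 'index.html',
                                'EXTENSION' => 'html'}, 'manual'), "\n";
print sketch_top_node_filename({'EXTENSION' => 'html'}, 'manual'), "\n";
```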

Finally, there is:

$result = $converter->output_internal_links()

At this level, the method just returns undef. It is used in the HTML output, following the specification of the --internal-links option of texi2any.


13.6 Texinfo::Convert::Converter SEE ALSO

Texinfo::Common, Texinfo::Convert::Unicode, Texinfo::Report, Texinfo::Translations, Texinfo::Convert::Utils and Texinfo::Parser.


13.7 Texinfo::Convert::Converter AUTHOR

Patrice Dumas, <pertusus@free.fr>