vis

a vi-like editor based on Plan 9's structural regular expressions

git clone https://9o.is/git/vis.git

text.rst


Text
====

The core text management data structure supports efficient
modifications and provides a byte string interface. Text positions
are represented as ``size_t``. Valid addresses are in the range
``[0, text_size(txt)]``. An invalid position is denoted by ``EPOS``.
Access to the non-contiguous pieces is available by means of an iterator
interface or a copy mechanism. Text revisions are tracked in a history
graph.
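
As a rough illustration of the position convention, the following
self-contained sketch models a text as a plain buffer. The names
``DEMO_EPOS``, ``demo_pos_valid`` and ``demo_find_byte`` are hypothetical and
not part of the library. Representing the invalid position as the largest
``size_t`` value is likewise an assumption made for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Assumption: the invalid position is modelled as the largest size_t. */
#define DEMO_EPOS ((size_t)-1)

/* A position is valid if it lies in [0, size]; size itself addresses EOF. */
bool demo_pos_valid(size_t size, size_t pos) {
    return pos <= size;
}

/* Search for byte c starting at pos. Returns the offset of the first match,
 * or DEMO_EPOS if there is none or if pos is not a valid position. */
size_t demo_find_byte(const char *buf, size_t size, size_t pos, char c) {
    assert(buf != NULL);
    if (!demo_pos_valid(size, pos))
        return DEMO_EPOS;
    const char *hit = memchr(buf + pos, c, size - pos);
    return hit ? (size_t)(hit - buf) : DEMO_EPOS;
}
```

Callers compare the result against the sentinel before using it as an
offset, which keeps "not found" distinct from the valid position ``0``.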

.. note:: The text is assumed to be encoded in `UTF-8 <https://tools.ietf.org/html/rfc3629>`_.

Load
----

.. doxygengroup:: load
   :content-only:

State
-----

.. doxygengroup:: state
   :content-only:

Modify
------

.. doxygengroup:: modify
   :content-only:

Access
------

The individual pieces of the text are not necessarily stored in a
contiguous memory block. These functions copy the requested range into
such a region.
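
To make the copy mechanism concrete, here is a minimal sketch that gathers a
byte range from several separately stored pieces into one caller-supplied
buffer. ``DemoPiece`` and ``demo_bytes_get`` are hypothetical names chosen
for illustration; the library's actual piece representation is not shown in
this document and may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical piece: a span of bytes stored somewhere in memory. */
typedef struct {
    const char *data;
    size_t len;
} DemoPiece;

/* Copy up to len bytes starting at logical position pos into buf.
 * Returns the number of bytes copied, which is smaller than len
 * if the text ends first. */
size_t demo_bytes_get(const DemoPiece *pieces, size_t count,
                      size_t pos, size_t len, char *buf) {
    assert(pieces != NULL && buf != NULL);
    size_t copied = 0;
    for (size_t i = 0; i < count && copied < len; i++) {
        if (pos >= pieces[i].len) {
            /* The requested range starts after this piece: skip it. */
            pos -= pieces[i].len;
            continue;
        }
        size_t avail = pieces[i].len - pos;
        size_t want = len - copied;
        size_t n = avail < want ? avail : want;
        memcpy(buf + copied, pieces[i].data + pos, n);
        copied += n;
        pos = 0; /* all following pieces are read from their start */
    }
    return copied;
}
```

The caller owns ``buf`` and must make it at least ``len`` bytes large; the
return value tells how many of those bytes are meaningful.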

.. doxygengroup:: access
   :content-only:

Iterator
--------

An iterator points to a given text position and provides interfaces to
adjust said position or read the underlying byte value. Functions which
take a ``char`` pointer will generally assign the byte value *after*
the iterator has been updated.
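
The "update first, then assign" convention can be sketched as follows. This
is a simplified model over a single contiguous buffer; ``DemoIterator``,
``demo_byte_next`` and ``demo_byte_prev`` are hypothetical names, and
the handling of the end of the buffer is an assumption of this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical iterator over one contiguous buffer. */
typedef struct {
    const char *buf; /* start of the buffer */
    size_t len;      /* number of bytes in the buffer */
    size_t pos;      /* current position, in [0, len] */
} DemoIterator;

/* Advance by one byte, then store the byte at the NEW position in *b.
 * Returns false (leaving *b untouched) if already at the end. */
bool demo_byte_next(DemoIterator *it, char *b) {
    assert(it != NULL);
    if (it->pos >= it->len)
        return false;
    it->pos++;                 /* 1. update the iterator ...           */
    if (b)                     /* 2. ... then report the byte under it */
        *b = it->pos < it->len ? it->buf[it->pos] : '\0';
    return true;
}

/* Step back by one byte, then store the byte at the NEW position in *b.
 * Returns false (leaving *b untouched) if already at the start. */
bool demo_byte_prev(DemoIterator *it, char *b) {
    assert(it != NULL);
    if (it->pos == 0)
        return false;
    it->pos--;
    if (b)
        *b = it->buf[it->pos];
    return true;
}
```

With this ordering, a loop of the form ``while (demo_byte_next(&it, &c))``
always sees the byte the iterator currently points to, not the one it
just left.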

.. doxygenstruct:: Iterator

.. doxygengroup:: iterator
   :content-only:

Byte
^^^^

.. note:: For a read attempt at EOF (i.e. ``text_size``) an artificial ``NUL``
          byte which is not actually part of the file is returned.
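
Because a file may itself contain ``NUL`` bytes, a returned ``NUL`` alone
does not prove that EOF was reached. The sketch below, using the hypothetical
helpers ``demo_byte_at`` and ``demo_is_eof`` on a plain buffer, shows one
way to tell the two cases apart by comparing the position with the size.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Read the byte at pos. At pos == size an artificial NUL is reported;
 * beyond that the read fails and *b is left untouched. */
bool demo_byte_at(const char *buf, size_t size, size_t pos, char *b) {
    assert(buf != NULL && b != NULL);
    if (pos > size)
        return false;
    *b = pos < size ? buf[pos] : '\0';
    return true;
}

/* True only if a NUL read at pos would be the artificial EOF byte. */
bool demo_is_eof(size_t size, size_t pos) {
    return pos == size;
}
```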

.. doxygengroup:: iterator_byte
   :content-only:

Codepoint
^^^^^^^^^

These functions advance to the next/previous leading byte of a UTF-8
encoded Unicode codepoint by skipping over all continuation bytes of
the form ``10xxxxxx``.
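
The continuation-byte test amounts to checking the two most significant
bits. The following sketch applies it to a plain buffer; the function names
are hypothetical and the code works on offsets instead of an ``Iterator``
to stay self-contained.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Continuation bytes have the bit pattern 10xxxxxx. */
bool demo_is_continuation(unsigned char c) {
    return (c & 0xC0) == 0x80;
}

/* Offset of the next leading byte after pos, or size if there is none. */
size_t demo_codepoint_next(const char *buf, size_t size, size_t pos) {
    assert(buf != NULL && pos <= size);
    if (pos < size)
        pos++;
    while (pos < size && demo_is_continuation((unsigned char)buf[pos]))
        pos++;
    return pos;
}

/* Offset of the previous leading byte before pos, or 0 if there is none. */
size_t demo_codepoint_prev(const char *buf, size_t size, size_t pos) {
    assert(buf != NULL && pos <= size);
    if (pos > 0)
        pos--;
    while (pos > 0 && demo_is_continuation((unsigned char)buf[pos]))
        pos--;
    return pos;
}
```

Note that this only locates leading bytes; it does not validate that the
sequence is well-formed UTF-8.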

.. doxygengroup:: iterator_code
   :content-only:

Grapheme Clusters
^^^^^^^^^^^^^^^^^

These functions advance to the next/previous grapheme cluster.

.. note:: The grapheme cluster boundaries are currently not implemented
          according to `UAX#29 rules <http://unicode.org/reports/tr29>`_.
          Instead, a base character followed by arbitrarily many combining
          characters, as reported by ``wcwidth(3)``, is skipped.

.. doxygengroup:: iterator_char
   :content-only:

Lines
-----

Translate between 1-based line numbers and 0-based byte offsets.
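
The mapping between the two numbering schemes can be sketched with a linear
scan for newline bytes. The names below are hypothetical, as are the error
conventions (``DEMO_EPOS`` for an unknown line, ``0`` for an invalid
position); the library may well cache line information instead of
rescanning.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumption: the invalid position is modelled as the largest size_t. */
#define DEMO_EPOS ((size_t)-1)

/* Byte offset (0-based) of the first byte of line lineno (1-based),
 * or DEMO_EPOS if the buffer has no such line. */
size_t demo_pos_by_lineno(const char *buf, size_t size, size_t lineno) {
    assert(buf != NULL);
    if (lineno == 0)
        return DEMO_EPOS;
    size_t pos = 0;
    for (size_t line = 1; line < lineno; line++) {
        const char *nl = memchr(buf + pos, '\n', size - pos);
        if (!nl)
            return DEMO_EPOS;
        pos = (size_t)(nl - buf) + 1; /* the line starts after the newline */
    }
    return pos;
}

/* Line number (1-based) containing byte offset pos, or 0 if pos is
 * outside [0, size]. A newline byte counts as part of the line it ends. */
size_t demo_lineno_by_pos(const char *buf, size_t size, size_t pos) {
    assert(buf != NULL);
    if (pos > size)
        return 0;
    size_t lineno = 1;
    for (size_t i = 0; i < pos; i++)
        if (buf[i] == '\n')
            lineno++;
    return lineno;
}
```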

.. doxygengroup:: lines
   :content-only:

History
-------

Interfaces to the history graph.

.. doxygengroup:: history
   :content-only:

Marks
-----

A mark keeps track of a text position. Subsequent text changes will update
all marks placed after the modification point. Reverting to an older text
state will hide all affected marks; redoing the changes will restore them.
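
The effect of a change on marks behind the modification point can be
modelled with plain offset arithmetic. This is only a conceptual sketch:
the function names are hypothetical, the library is not claimed to store
marks as offsets, and reporting ``DEMO_EPOS`` for a mark whose byte was
deleted is an assumption of this example.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: the invalid position is modelled as the largest size_t. */
#define DEMO_EPOS ((size_t)-1)

/* Position of a mark after len bytes were inserted at pos.
 * Marks before the insertion point are unaffected. */
size_t demo_mark_after_insert(size_t mark, size_t pos, size_t len) {
    assert(mark != DEMO_EPOS);
    return mark >= pos ? mark + len : mark;
}

/* Position of a mark after len bytes were deleted starting at pos.
 * A mark inside the deleted range no longer has a position. */
size_t demo_mark_after_delete(size_t mark, size_t pos, size_t len) {
    assert(mark != DEMO_EPOS);
    if (mark < pos)
        return mark;
    if (mark - pos < len)
        return DEMO_EPOS;
    return mark - len;
}
```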

.. warning:: Due to an optimization, cached modifications (i.e. no ``text_snapshot``
             was performed between setting the mark and issuing the changes) might
             not adjust mark positions accurately.

.. doxygentypedef:: Mark

.. doxygendefine:: EMARK

.. doxygengroup:: mark
   :content-only:

Save
----

.. doxygengroup:: save
   :content-only:

Miscellaneous
-------------

.. doxygengroup:: misc
   :content-only: