A New Markdown parser for PHP5.4

Ciconia - A New Markdown Parser for PHP

A fully extensible Markdown parser for PHP 5.4. Ciconia is a collection of extensions, so you can replace, add, or remove each parsing mechanism.

Try Demo / Docs / Supported Syntax / API Reference

Requirements

  • PHP5.4+
  • Composer

Installation

Create a composer.json:

{
    "require": {
        "kzykhys/ciconia": "~1.0.0"
    }
}

and run

php composer.phar install

Usage

Traditional Markdown

use Ciconia\Ciconia;

$ciconia = new Ciconia();
$html = $ciconia->render('Markdown is **awesome**');

// <p>Markdown is <strong>awesome</strong></p>

Github Flavored Markdown

To activate the six GFM features:

use Ciconia\Ciconia;
use Ciconia\Extension\Gfm;

$ciconia = new Ciconia();
$ciconia->addExtension(new Gfm\FencedCodeBlockExtension());
$ciconia->addExtension(new Gfm\TaskListExtension());
$ciconia->addExtension(new Gfm\InlineStyleExtension());
$ciconia->addExtension(new Gfm\WhiteSpaceExtension());
$ciconia->addExtension(new Gfm\TableExtension());
$ciconia->addExtension(new Gfm\UrlAutoLinkExtension());

$html = $ciconia->render('Markdown is **awesome**');

// <p>Markdown is <strong>awesome</strong></p>

Options

Option          Type     Default  Description
tabWidth        integer  4        Number of spaces per tab
nestedTagLevel  integer  3        Max depth of nested HTML tags
strict          boolean  false    Throws an exception if the Markdown contains a syntax error

use Ciconia\Ciconia;

$ciconia = new Ciconia();
$html = $ciconia->render(
    'Markdown is **awesome**',
    ['tabWidth' => 8, 'nestedTagLevel' => 5, 'strict' => true]
);

Rendering HTML or XHTML

Ciconia renders HTML by default. If you prefer XHTML:

use Ciconia\Ciconia;
use Ciconia\Renderer\XhtmlRenderer;

$ciconia = new Ciconia(new XhtmlRenderer());
$html = $ciconia->render('Markdown is **awesome**');

// <p>Markdown is <strong>awesome</strong></p>

Extend Ciconia

How to Extend

Creating an extension is easy: just implement Ciconia\Extension\ExtensionInterface.

Your class must implement 2 methods.

void register(Ciconia\Markdown $markdown)

Registers your callbacks with the Markdown event manager. Ciconia\Markdown is an instance of Ciconia\Event\EmitterInterface (similar to Node.js's EventEmitter).

string getName()

Returns the name of your extension. If the name matches one of the core extensions, your extension replaces that core extension.
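
For readers unfamiliar with the EventEmitter pattern, the following is a minimal standalone sketch of the idea behind register(). It is not Ciconia's implementation: the class name TinyEmitter, the emit() method, and the "lower priority number runs first" ordering are all assumptions made for illustration.

```php
<?php

// Hypothetical stand-in for an event emitter; NOT Ciconia's actual class.
final class TinyEmitter
{
    /** @var array event name => priority => list of callbacks */
    private $listeners = [];

    // Subscribe a callback to an event. Assumption: lower priority runs first.
    public function on($event, callable $callback, $priority = 10)
    {
        $this->listeners[$event][$priority][] = $callback;

        return $this;
    }

    // Call every callback registered for the event, in priority order.
    public function emit($event, array $args = [])
    {
        if (!isset($this->listeners[$event])) {
            return;
        }

        $byPriority = $this->listeners[$event];
        ksort($byPriority);

        foreach ($byPriority as $callbacks) {
            foreach ($callbacks as $callback) {
                call_user_func_array($callback, $args);
            }
        }
    }
}

// A mutable object plays the role of the text being parsed.
$doc = new stdClass();
$doc->text = 'hello';

$emitter = new TinyEmitter();
$emitter->on('inline', function ($doc) { $doc->text = strtoupper($doc->text); }, 20);
$emitter->on('inline', function ($doc) { $doc->text .= ' world'; }, 10);
$emitter->emit('inline', [$doc]);

echo $doc->text, "\n"; // HELLO WORLD
```

An extension's register() method does the equivalent of the two on() calls: it subscribes callbacks, and the parser later fires the events.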

Extension Example

This sample extension turns @username mentions into links.

<?php

use Ciconia\Common\Text;
use Ciconia\Extension\ExtensionInterface;

class MentionExtension implements ExtensionInterface
{

    /**
     * {@inheritdoc}
     */
    public function register(\Ciconia\Markdown $markdown)
    {
        $markdown->on('inline', [$this, 'processMentions']);
    }

    /**
     * @param Text $text
     */
    public function processMentions(Text $text)
    {
        // Turn @username into [@username](http://example.com/user/username).
        // The lookbehind keeps the character before "@" out of the match, so it is not dropped.
        $text->replace('/(?<![a-zA-Z0-9.])@([A-Za-z]+[A-Za-z0-9]+)/', function (Text $w, Text $username) {
            return '[@' . $username . '](http://example.com/user/' . $username . ')';
        });
    }

    /**
     * {@inheritdoc}
     */
    public function getName()
    {
        return 'mention';
    }
}

Register your extension.

<?php

require __DIR__ . '/vendor/autoload.php';

$ciconia = new \Ciconia\Ciconia();
$ciconia->addExtension(new MentionExtension());
echo $ciconia->render('@kzykhys my email address is [email protected]!');

Output

<p><a href="http://example.com/user/kzykhys">@kzykhys</a> my email address is [email protected]!</p>

Each extension handles strings as Text objects. See the API section of kzykhys/Text.
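
To experiment with the mention pattern without installing anything, the same idea can be sketched with PHP's built-in preg_replace_callback instead of the Text object. The function name linkMentions and the base URL are placeholders. This variant uses a negative lookbehind, so the character before "@" is checked but not consumed.

```php
<?php

// Standalone sketch using plain PCRE; linkMentions() is a hypothetical helper,
// not part of Ciconia or kzykhys/Text.
function linkMentions($markdown, $baseUrl = 'http://example.com/user/')
{
    // (?<![a-zA-Z0-9.])  "@" must not directly follow a letter, digit or dot,
    //                    so addresses such as name@host are left alone.
    // ([A-Za-z]+[A-Za-z0-9]+)  user name: starts with a letter, two or more characters.
    $pattern = '/(?<![a-zA-Z0-9.])@([A-Za-z]+[A-Za-z0-9]+)/';

    return preg_replace_callback($pattern, function (array $m) use ($baseUrl) {
        // $m[0] is the whole match ("@alice"), $m[1] the captured user name.
        return '[@' . $m[1] . '](' . $baseUrl . $m[1] . ')';
    }, $markdown);
}

echo linkMentions('Thanks @alice, mail me at bob@example.org'), "\n";
// Thanks [@alice](http://example.com/user/alice), mail me at bob@example.org
```

The result is still Markdown, which is why the extension hooks into the inline event: the generated link syntax is turned into an anchor tag by the regular link parser afterwards.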

Events

Possible events are:

Event       Description
initialize  Document-level parsing. Called first in the sequence.
block       Block-level parsing. Called after initialize.
inline      Inline-level parsing. Generally called by block-level parsers.
detab       Converts tabs to spaces. Generally called by block-level parsers.
outdent     Removes one level of line-leading tabs or spaces. Generally called by block-level parsers.
finalize    Called after block.
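
As a rough illustration of what the detab and outdent events are for, here is a standalone sketch of both operations. These helper functions are hypothetical and are not Ciconia's own code. The sketch assumes single-byte (ASCII) text before each tab and uses a tab width of 4 by default, matching the tabWidth option.

```php
<?php

// Illustrative helpers only; these are not Ciconia's own functions.

// Expand each tab to the next tab stop, line by line.
function detab($text, $tabWidth = 4)
{
    $lines = explode("\n", $text);

    foreach ($lines as $i => $line) {
        while (($pos = strpos($line, "\t")) !== false) {
            // Number of spaces needed to reach the next multiple of $tabWidth.
            $pad = $tabWidth - ($pos % $tabWidth);
            $line = substr($line, 0, $pos) . str_repeat(' ', $pad) . substr($line, $pos + 1);
        }
        $lines[$i] = $line;
    }

    return implode("\n", $lines);
}

// Remove one level of indentation: one leading tab or up to $tabWidth spaces per line.
function outdent($text, $tabWidth = 4)
{
    return preg_replace('/^(\t| {1,' . (int) $tabWidth . '})/m', '', $text);
}

echo detab("a\tb"), "\n";              // a   b
echo outdent("    code\n\tmore"), "\n"; // code, then more, each without indentation
```

A block-level parser for indented code or nested lists would typically strip one indentation level like this before handing the inner text on for further parsing.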

See the source code of Extensions

See events and timing information

Create your own Renderer

Ciconia supports HTML and XHTML output. If you want to customize the output, create a class that implements Ciconia\Renderer\RendererInterface.

See Ciconia\Renderer\RendererInterface

Command Line Interface

Usage

Basic usage (outputs the result to STDOUT):

ciconia /path/to/file.md

The following command saves the result to a file:

ciconia /path/to/file.md > /path/to/file.html

Or using a pipe (this does not work on Windows):

echo "Markdown is **awesome**" | ciconia

Command Line Options

 --gfm                 Activate Gfm extensions
 --compress (-c)       Remove whitespace between HTML tags
 --format (-f)         Output format (html|xhtml) (default: "html")
 --lint (-l)           Syntax check only (lint)

Where is the script?

The CLI script is installed in vendor/bin/ciconia by default. There are two ways to specify an alternate vendor binary location:

  1. Setting the bin-dir configuration setting in composer.json
  2. Setting the environment variable COMPOSER_BIN_DIR

http://getcomposer.org/doc/articles/vendor-binaries.md

Using PHAR version

You can also use a single PHAR file:

ciconia.phar /path/to/file.md

If you prefer to access this command globally, download ciconia.phar and move it into your PATH.

mv ciconia.phar /usr/local/bin/ciconia

Testing

Install or update dev dependencies.

php composer.phar update --dev

and run phpunit

License

The MIT License

Contributing

Feel free to fork this repository and send a pull request. (A list of contributors)

Author

Kazuyuki Hayashi (@kzykhys)

Comments
  • Parsing error with italic and URLs

    Hi

    First off, excellent lib! I really like it :-) I came across an issue where Ciconia is transforming URLs in <a> tags as well as <img> tags (don't know if it affects even more).

    How to reproduce:

    1. Install Ciconia via Composer "kzykhys/ciconia": "1.*"
    2. Use the following script:
    <?php
    
    include 'vendor/autoload.php';
    
    use Ciconia\Ciconia;
    
    $ciconia = new Ciconia();
    echo $ciconia->render('<a href="assets/images/5/tab_data_n_origin.png">Image</a>');
    
    3. The output will be <p><a href="assets/images/5/tab<em>data</em>n_origin.png">Image</a></p>

    which is wrong because it should not touch inline HTML code (see http://daringfireball.net/projects/markdown/syntax#html).

    opened by Toflar 9
  • Using sabre/event for event management

    Hi!

    The attached pull request migrates from using an internal event management system to using sabre/event. It has a bit more options, but still has the same feature set; and the method signatures are nearly identical.

    The main differences are:

    • If an event handler (subscribed with on) returns a literal false, event handling stops.
    • There is no buildParameters, but it's easy to add this feature anyway.
    • The default priority is 100 instead of 10, so I multiplied every priority with 10.

    I hope you like it. I realize this is a selfish PR as I'd like to see my library used more. I would be willing to add EmitterAwareInterface and EmitterAwareTrait to sabre/event, if that helps convincing you :)

    opened by evert 6
  • Chaining extensions

    I wrote a small extension that helps me integrate Bootstrap's grid system. But when the extension runs, the text is no longer parsed as a regular paragraph block. Is there a way to "chain" extensions so that parsing does not stop after one extension?

    For example, if I have something like this:

    {.col-md-6} This is a paragraph
    

    It will be wrapped with <div class=".col-md-6"> by my extension, but it will not be transformed to a paragraph.

    opened by jenssegers 5
  • Footnotes

    Do you have plans to support this syntax?

    I get 10 times more traffic from [Google] 1 than from [Yahoo] [2] or [MSN] [3].

    [2]: http://search.yahoo.com/ "Yahoo Search" [3]: http://search.msn.com/ "MSN Search"

    opened by inoryy 5
  • Array syntax?

    Original title was 'Why 5.4?' but just noticed you are using traits.

    ~~Yes I could easily dig through the code (and I have to an extent) but I was wondering what the need for 5.4 was?~~

    I noticed that you are not using the new short array syntax $array = []; (which was introduced in 5.4), any reason why?

    opened by m4tthumphrey 4
  • InlineStyleExtension bug for img tag when using underscores

    ![thumb_01.png](/uploads/0001/01/thumb_01.png)
    

    renders as:

    <img src="/uploads/0001/01/thumb&lt;em&gt;01.png" alt="thumb&lt;/em&gt;01.png">
    

    instead of:

    <img src="/uploads/0001/01/thumb_01.png" alt="thumb_01.png">
    
    opened by Pym 3
  • Improve performance

    $ php bin/markbench benchmark --profile=github-sample
    Runtime: PHP5.5.3
    Host:    Linux vm1 3.8.0-31-generic #46-Ubuntu SMP Tue Sep 10 20:03:44 UTC 2013 x86_64
    Profile: Sample content from Github (http://github.github.com/github-flavored-markdown/sample_content.html) / 1000 times
    Class:   Markbench\Profile\GithubSampleProfile
    
    +----------------------+---------+---------+---------------+---------+--------------+
    | package              | version | dialect | duration (MS) | MEM (B) | PEAK MEM (B) |
    +----------------------+---------+---------+---------------+---------+--------------+
    | erusev/parsedown     | 0.4.6   |         | 10819         | 6291456 | 6553600      |
    | michelf/php-markdown | 1.3     |         | 36887         | 6815744 | 6815744      |
    | michelf/php-markdown | 1.3     | extra   | 49626         | 6815744 | 7340032      |
    | kzykhys/ciconia      | v0.1.4  |         | 64959         | 7340032 | 7602176      |
    | kzykhys/ciconia      | v0.1.4  | gfm     | 68987         | 7077888 | 7602176      |
    +----------------------+---------+---------+---------------+---------+--------------+
    
    wontfix 
    opened by kzykhys 3
  • Hook to process links

    Hi!

    I'm working on a static site generator with Ciconia.

    I would like to automatically be able to process all the links in the markdown source, and prepend the link with a base url (if they are relative).

    Ideally, I would also be able to automatically add .md to the resulting link.

    To do this effectively, it would be awesome if it were somehow possible to add a callback that allows me to process and rewrite links... For example something like this:

    $ciconia->setLinkHandler(function($in) {
    
       return $in . '.md';
    
    });
    

    But I would also settle with a simple baseUrl and let Ciconia do the heavy lifting :)

    opened by evert 3
  • How to disable certain markdown?

    I would like to disable certain markdown from the core, for example, I don't want users to post images, I can go ahead and not document that markdown but if a user is little bit technical and knows about markdown can easily use them...

    I would like to disable then the following markdown:

    ![Alt text](/path/to/image.png)
    

    Is there any option to do this? I haven't found anything...

    opened by t3chn0r 2
  • Question about weird error

    @kzykhys, this is more a question than a bug.

    I have a kinda large table in GFM with code tags inside its cells that breaks my local Apache Server when I parse it.

    This doesn't happen in "Try Ciconia" or on my prod server, but I've already reproduced it on 3 completely different dev machines with the same results.

    Do you have any idea about what could it be?

    This is the Markdown:

    Table
    -----
    
    Atributo       | Tipo      | Notas
    --             | --        | --
    id             | Integer   | |
    code           | String    | |
    subcode        | String    | |
    description    | String    | |
    status         | Integer   | Uno de: `0` (pendiente), `1` (disponible), `2` (terminada), `3` (cancelada), `4` (vencida).
    type           | Integer   | Uno de: `0` (normal), `1` (encuesta), `2` (supervisión).
    priority       | Integer   | Del 1 al 5, siendo 5 la prioridad más alta.
    street         | String    | |
    district       | String    | |
    zipcode        | String    | |
    city           | String    | |
    state          | String    | |
    country        | String    | |
    address        | String    | Dirección estilizada para mostrar.
    latitude       | Decimal   | |
    longitude      | Decimal   | |
    form_id        | Integer   | |
    group_id       | Integer   | |
    created_at     | Timestamp | |
    updated_at     | Timestamp | |
    available_at   | Timestamp | |
    expires_at     | Timestamp | |
    started_at     | Timestamp | |
    finished_at    | Timestamp | |
    received_at    | Timestamp | |
    location_id    | Integer   | |
    distance       | Integer   | Distancia en metros a la que se realizó la visita.
    timespan       | Integer   | Duración en minutos de la visita.
    alarms         | Integer   | |
    supervising_id | Integer   | El id de la visita que se está supervisando.
    supervision    | Integer   | Uno de: `null` (sin supervisar), `0` (en supervisión), `1` (aceptada), `2` (corregida), `3` (rechazada).
    version        | Integer   | |
    

    It just "break" the server and the Apache logs says this:

    [mpm_winnt:notice] [pid 5668:tid 468] AH00428: Parent: child process exited with status 3221225725 -- Restarting.
    

    I'm pretty sure that the Markdown is fine.

    BTW, you can try any of this and suddenly the parser will work again:

    • Remove one (any) of the lines with more than one code tag in the last column.
    • Remove two or more of any of the other lines.

    I've tried to find the last piece of code in Ciconia that's being executed, but honestly I was unable to track it down.

    Any hints?

    opened by joelcuevas 2
  • Parsing error in "Try Ciconia"

    In "Try Ciconia" at http://ciconia.kzykhys.com/.

    Input:

    Look at: `http://someurl.com` and text.
    

    Output:

    <p>Look at: <code><a href="http://someurl.com&lt;/cod">http://someurl.come&gt; and text.</a></code></p>
    
    opened by joelcuevas 2
  • Feature: support for GFM anchors

    Support for GFM-style auto-generated anchor-tags is missing - for example, ## Opinionated in GFM generates a <a> tag with an auto-generated name attribute, e.g.:

    <h2>
        <a id="user-content-opinionated" class="anchor" href="#opinionated" aria-hidden="true">
            <span class="octicon octicon-link"></span>
        </a>
        Opinionated
    </h2>
    

    Is support for GFM deliberately partial in Ciconia?

    If so, it might be a good idea to clarify this in the documentation - I got the impression that GFM was fully-supported, but it appears to be partial? It would be good to list in the README not just which features are supported, but which ones are unsupported.

    opened by mindplay-dk 0
  • GFM whitespace extension creates too many <br> tags

    I'm not 100% sure how it should work, but I've been rendering this document with Ciconia, and the whitespace behavior is inconsistent with that of GitHub.

    Look at the leading paragraph, which ends in the words, "with a gentle learning curve" - there's an extra line break inserted there.

    Then look at the source document - there is a line-break before those words, but it doesn't render as a <br> on GitHub.

    If I comment out $engine->addExtension(new Gfm\WhiteSpaceExtension()), it looks more like GitHub.

    What gives?

    opened by mindplay-dk 0
  • Duplicate tests in the test-suite

    I can understand duplicating certain tests when there is overlap between two test-suites:

    2 duplicates:
    - kzykhys/ciconia/test/Ciconia/Resources/core/em-spaces.md|out
    - kzykhys/ciconia/test/Ciconia/Resources/gfm/em-spaces.md|out
    2 duplicates:
    - kzykhys/ciconia/test/Ciconia/Resources/core/markdown-testsuite/list-multiparagraphs.md|out
    - kzykhys/ciconia/test/Ciconia/Resources/gfm/ws-list-multiparagraphs.md|out
    2 duplicates:
    - kzykhys/ciconia/test/Ciconia/Resources/gfm/table-invalid-body.md|out
    - kzykhys/ciconia/test/Ciconia/Resources/options/strict/gfm/table-invalid-body.md|out
    

    But what's the purpose of tests duplicated within the same test-suite?

    2 duplicates:
    - kzykhys/ciconia/test/Ciconia/Resources/core/markdown-testsuite/EOL-CR+LF.md|out
    - kzykhys/ciconia/test/Ciconia/Resources/core/markdown-testsuite/EOL-LF.md|out
    2 duplicates:
    - kzykhys/ciconia/test/Ciconia/Resources/core/markdown-testsuite/inline-code-with-visible-backtick.md|out
    - kzykhys/ciconia/test/Ciconia/Resources/core/markdown-testsuite/inline-code.md|out
    2 duplicates:
    - kzykhys/ciconia/test/Ciconia/Resources/core/markdown-testsuite/unordered-list-items-leading-1space.md|out
    - kzykhys/ciconia/test/Ciconia/Resources/core/markdown-testsuite/unordered-list-items.md|out
    

    I inspected the EOL-CR+LF.md and EOL-LF.md in a hex editor as per your note, and these files are identical, as far as I can tell - and they both appear (on github.com) to be precisely 119 bytes, which, if one has CR+LF bytes for line breaks, it should have slightly more bytes than the one that has only LF bytes, correct?

    Perhaps your IDE or some other tool, at some point, cleaned up what was deliberately intended to be leading/trailing space for test purposes?

    I hope this is helpful :-)

    opened by mindplay-dk 1
  • Consecutive tables - only first table is parsed

    Two consecutive tables

    | head | head |
    |-------|-------|
    | body | body |
    
    | head | head |
    |-------|-------|
    | body | body |
    

    are converted to

    <table>
    <thead>
    <tr>
    <th>head</th>
    <th>head</th>
    </tr>
    </thead>
    <tbody>
    <tr>
    <td>body</td>
    <td>body</td>
    </tr>
    </tbody>
    </table>
    
    <p>| head | head |<br>
    |-------|-------|<br>
    | body | body |</p>
    
    opened by zdenekdrahos 0
  • Parsing bug

    Hi, first, thanks for Ciconia. Nice project! I have problems combining HTML markup and Markdown. Is it possible?

    <nav class="class">
    
    * [Item <span class="sep">›</span>](/)
    * [Item <span class="sep">›</span>](/link)
    * [Item](/link)
    
    </nav>
    
    # lorem ipsum
    

    this compiles

    <p><nav class="class"></p>
    
    <ul>
    <li><a href="/">Item <span class="sep">›</span></a></li>
    <li><a href="/link">Item <span class="sep">›</span></a></li>
    <li><a href="/link">Item</a></li>
    </ul>
    
    <p></nav></p>
    
    <h1>lorem ipsum</h1>
    

    it may create paragraphs between markup in html? Thanks

    opened by plasm 0
Releases(v1.0.3)