WordPress: How to use get_shortcode_regex()

get_shortcode_regex() (GSR() from now on) returns the regular expression WordPress uses to find shortcodes in a post's text.

I was writing a filter to take the post text, parse the shortcodes, and modify them by adding an "id" parameter.

After I spent some time writing a regex to parse the shortcodes, I discovered GSR(). GSR() was better and more complete.

Now I just had to learn to use it - and there weren't any docs.

Let's Review How to Use Shortcodes

You've basically got five ways to use shortcodes:

[[fe-escape]]
[[fe-escape] ... [/fe-escape]]
[fe-single /]
[fe-single]
[fe-single title="foobar" name="janedoe"]Header Content[/fe-single]

The first two, fe-escape, are both escaped so that the raw tags are displayed.

The last three are the normal usages.

[fe-single /]  # A standalone shortcode.

[fe-single]  # Another standalone shortcode.

[fe-single title="foobar" name="janedoe"]Header Content[/fe-single]
# A shortcode with attributes, and wrapping content.

Now Some Code to Dig Around WordPress

In functions.php (in my child theme) I added this code:

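function fe_set_ids($text) {
    // Build the regex from the currently registered shortcodes.
    $regex = get_shortcode_regex();
    // PREG_SET_ORDER: one array per shortcode found.
    $count = preg_match_all( '/'.$regex.'/s', $text, $matches, PREG_SET_ORDER );
    if ($count==0) return $text;

    // Debugging only: halt here so the error output shows $matches.
    throw new \Exception();
    return $text;
}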
add_filter( 'content_save_pre', 'fe_set_ids' );

function fe_single_shortcode() { }
function fe_escape_shortcode() { }
add_shortcode( 'fe-single', 'fe_single_shortcode' );
add_shortcode( 'fe-escape', 'fe_escape_shortcode' );

That code doesn't do anything useful yet. It accepts the text of a post, parses it with GSR(), and then throws an exception so we get the debugging output.

We specify PREG_SET_ORDER so preg_match_all() returns one array per match, instead of one array per capture group. If we match nothing, we just pass the text through.

    $count = preg_match_all( '/'.$regex.'/s', $text, $matches, PREG_SET_ORDER );
    if ($count==0) return $text;
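
To see what PREG_SET_ORDER changes, here's a small stand-alone sketch. It uses a toy pattern (a bracketed word), not the WordPress shortcode regex, and the variable names are made up for the illustration.

```php
<?php
// Toy pattern (NOT the WordPress shortcode regex): a bracketed word.
$text = '[alpha] and [beta]';

// Default flag, PREG_PATTERN_ORDER: results are grouped by capture group.
// $by_group[0] holds every full match, $by_group[1] every first capture.
preg_match_all( '/\[(\w+)\]/', $text, $by_group );

// PREG_SET_ORDER: results are grouped by match, one array per match.
// $by_match[0] is the first match, $by_match[1] the second.
preg_match_all( '/\[(\w+)\]/', $text, $by_match, PREG_SET_ORDER );

foreach ( $by_match as $m ) {
    echo $m[0] . ' => ' . $m[1] . "\n";
}
```

With set order, each $m in the loop is one complete shortcode match, which is the shape used in the rest of this post.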

GSR() matches only registered shortcodes, so we need to define them:

function fe_single_shortcode() { }
function fe_escape_shortcode() { }
add_shortcode( 'fe-single', 'fe_single_shortcode' );
add_shortcode( 'fe-escape', 'fe_escape_shortcode' );

They don't do anything.

Time to Blow It Up

Fire up WordPress, log in, and create a new Page. Paste this into the text:

[[fe-escape] ... [/fe-escape]]
[[fe-escape]]
[fe-single /]
[fe-single title="foobar" name="janedoe"]Header Content[/fe-single]

For best results, click the "Text" tab first, then paste in the code.

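We're only using four shortcodes because the parser messes up when you have both the single and paired "fe-single" shortcodes next to each other: a bare [fe-single] followed later by [/fe-single] gets read as one opening tag wrapping everything in between.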

When you hit "Update" the system barfs out some debugging info, including $matches:

array (size=4)
  0 => 
    array (size=7)
      0 => string '[[fe-escape] ... [/fe-escape]]' (length=30)
      1 => string '[' (length=1)
      2 => string 'fe-escape' (length=9)
      3 => string '' (length=0)
      4 => string '' (length=0)
      5 => string ' ... ' (length=5)
      6 => string ']' (length=1)
  1 => 
    array (size=7)
      0 => string '[[fe-escape]]' (length=13)
      1 => string '[' (length=1)
      2 => string 'fe-escape' (length=9)
      3 => string '' (length=0)
      4 => string '' (length=0)
      5 => string '' (length=0)
      6 => string ']' (length=1)
  2 => 
    array (size=7)
      0 => string '[fe-single /]' (length=13)
      1 => string '' (length=0)
      2 => string 'fe-single' (length=9)
      3 => string ' ' (length=1)
      4 => string '/' (length=1)
      5 => string '' (length=0)
      6 => string '' (length=0)
  3 => 
    array (size=7)
      0 => string '[fe-single title=\"foobar\" name=\"janedoe\"]Header Content[/fe-single]' (length=71)
      1 => string '' (length=0)
      2 => string 'fe-single' (length=9)
      3 => string ' title=\"foobar\" name=\"janedoe\"' (length=34)
      4 => string '' (length=0)
      5 => string 'Header Content' (length=14)
      6 => string '' (length=0)

Let's see what's up here.

In each array, you have 7 elements (0 through 6), each one partially deconstructing the shortcode.

Element 0 is the complete match (as usual).

Element 1 is '[' if the shortcode is escaped. This is paired with element 6, which is the matching closing escape character.

Element 2 is the shortcode name.

Element 3 is a string with the attributes. The fourth match shows this.

Element 4 is "/" if it's a self-closing shortcode. The third match shows this.

Element 5 is the content that's wrapped by the shortcode.
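
Numbered elements are easy to mix up, so here's a minimal sketch that gives them names. The helper fe_label_match() and its array keys are hypothetical, made up for this example; the sample input is the third match from the dump above.

```php
<?php
// Hypothetical helper: give the numbered elements of one match readable names.
function fe_label_match( array $m ) {
    return array(
        'full'         => $m[0],
        // Escaped means both the extra "[" and the extra "]" are present.
        'escaped'      => ( '[' === $m[1] && ']' === $m[6] ),
        'tag'          => $m[2],
        'raw_atts'     => $m[3],
        'self_closing' => ( '/' === $m[4] ),
        'content'      => $m[5],
    );
}

// The third match from the dump above.
$info = fe_label_match( array( '[fe-single /]', '', 'fe-single', ' ', '/', '', '' ) );
echo $info['tag'] . ( $info['self_closing'] ? ' (self-closing)' : '' ) . "\n";
```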

Parsing the Attributes

shortcode_parse_atts() (SPA() from now on) parses element 3 and returns an array of attributes.

The code's hacked to throw an exception when it finds an attribute.

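function fe_set_ids($text) {
    // The incoming text is backslash-escaped; strip that before parsing.
    $text = stripcslashes($text);
    $regex = get_shortcode_regex();
    $count = preg_match_all( '/'.$regex.'/s', $text, $matches, PREG_SET_ORDER );
    // Nothing to do: re-escape and pass the text through.
    if ($count==0) return addslashes($text);

    foreach($matches as $m) {
        // $m[3] is the raw attribute string of this shortcode.
        $atts = shortcode_parse_atts( $m[3] );
        if ($atts) {
            // Debugging only: halt so the error output shows $atts.
            throw new \Exception();
        }
    }

    return addslashes($text);
}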
add_filter( 'content_save_pre', 'fe_set_ids' );

It looked like the text coming in had C-style escapes, so I used stripcslashes() to remove them. That also meant that I needed to re-introduce them before returning the text.

And the dump includes this for $atts:

array (size=2)
  'title' => string 'foobar' (length=6)
  'name' => string 'janedoe' (length=7)

If you don't apply stripcslashes() to the input, the attributes won't parse correctly, because there are backslashes in front of the double-quote characters: title=\"foobar\"
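
The effect of those backslashes can be reproduced outside WordPress. This is a minimal sketch that uses a toy name="value" pattern, much simpler than whatever SPA() uses internally, so treat it as an illustration of the problem rather than of WordPress's own parsing.

```php
<?php
// What the filter receives: quotes arrive backslash-escaped.
$slashed = ' title=\"foobar\" name=\"janedoe\"';

// Toy name="value" pattern (an assumption, not the WordPress one).
$pattern = '/(\w+)="([^"]*)"/';

// With the backslashes in place, no name="value" pair is found.
$before = preg_match_all( $pattern, $slashed, $unused );

// After stripping them, both attributes are found.
$clean = stripcslashes( $slashed );
$after = preg_match_all( $pattern, $clean, $pairs, PREG_SET_ORDER );

echo "before: $before, after: $after\n";

// Re-escape before handing the text back, as the filter above does.
$round_trip = addslashes( $clean );
```

For this input, $round_trip ends up identical to $slashed, which is why the filter can strip on the way in and re-add on the way out.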

So, there you have it. The rest of the code isn't shown here, but it performs a simple search-and-replace on shortcodes that don't have an "id='some-random-value'" attribute, and adds the attribute.

Once that's done, it's possible to programmatically refer to a specific shortcode in a post, and manipulate it.
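
Since the real code isn't shown, here's a rough sketch of what the "add an id" step could look like for a single match. The helper name fe_add_id(), the id value, and the crude check for an existing id are all assumptions made for this example, not the original implementation.

```php
<?php
// Hypothetical helper: return the matched shortcode with an id attribute added.
// $m is one match in the layout described earlier in this post.
function fe_add_id( array $m, $id ) {
    // Leave escaped shortcodes such as [[fe-escape]] alone.
    if ( '[' === $m[1] && ']' === $m[6] ) {
        return $m[0];
    }
    // Crude check for an existing id attribute (an assumption; inside
    // WordPress, parsing $m[3] with shortcode_parse_atts() is more reliable).
    if ( preg_match( '/(^|\s)id\s*=/', $m[3] ) ) {
        return $m[0];
    }
    // Insert right after "[tag", leaving the rest of the match untouched.
    $pos = strlen( $m[1] ) + 1 + strlen( $m[2] );
    return substr( $m[0], 0, $pos ) . ' id="' . $id . '"' . substr( $m[0], $pos );
}

// The third match from the dump earlier in this post.
$m = array( '[fe-single /]', '', 'fe-single', ' ', '/', '', '' );
echo fe_add_id( $m, 'fe-1' ) . "\n";
```

Inside the filter, a helper like this could probably be driven by preg_replace_callback() with the same '/'.$regex.'/s' pattern, since the callback receives the same kind of match array.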
