PHP preg_split()


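preg_split() splits a string on a separator given as a regular expression. It is similar to explode(), which does not support regular expressions. Because it has to run the regular expression engine, preg_split() is generally slower than explode(), so explode() is the better choice when the separator is a fixed string.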

<?PHP
	$str="<td>One<td>two<td>three<td>four";
	$arr=preg_split('/<td>/',$str);
	foreach($arr as $element) echo "$element, "; //, One, two, three, four,  (the first element is an empty string)
?>
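
As a rough sketch of when each function fits: for a fixed separator, explode() returns the same pieces as preg_split(), while a separator that varies needs a pattern. The second string and the variable names $fixed, $list and $varied are made up for illustration.

<?PHP
	// Fixed separator: explode() returns the same pieces as preg_split().
	$str="<td>One<td>two<td>three<td>four";
	$fixed=explode('<td>',$str);
	echo implode('|',$fixed),"\n"; //|One|two|three|four

	// Variable separator: a comma or semicolon with optional whitespace around it.
	$list="apple, banana;cherry ,  date";
	$varied=preg_split('/\s*[,;]\s*/',$list);
	echo implode('|',$varied),"\n"; //apple|banana|cherry|date
?>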

The syntax is preg_split(pattern,string,limit=-1,flags=0). If limit is given, at most that many substrings are returned: the string is split only up to that point, and the unsplit remainder becomes the last element of the returned array. A limit of -1 or 0 means no limit. There are three flags, which can be combined with the bitwise | operator:

  • PREG_SPLIT_NO_EMPTY: Only non-empty elements are returned
  • PREG_SPLIT_OFFSET_CAPTURE: Each substring is returned together with its offset in the original string
  • PREG_SPLIT_DELIM_CAPTURE: Text matched by parenthesized expressions in the separator pattern is returned as well
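
The effect of the limit parameter can be sketched as follows, assuming the same sample string as the other examples and a limit of 3 chosen only for illustration. No flags are set, so the leading empty element counts as one of the three.

<?PHP
	$str="<td>One<td>two<td>three<td>four";
	// limit=3: at most three elements; the third holds the unsplit remainder.
	$arr=preg_split('/<td>/',$str,3);
	foreach($arr as $element) echo "[$element] "; //[] [One] [two<td>three<td>four]
?>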

<?PHP
	$str="<td>One<td>two<td>three<td>four";
	$arr=preg_split('/<td>/',$str,-1,PREG_SPLIT_NO_EMPTY); //-1 means no limit
	foreach($arr as $element) echo "$element, "; //One, two, three, four, 
?>

Use PREG_SPLIT_DELIM_CAPTURE to also return the text matched by the parenthesized part of the separator pattern.
<?PHP
	$str="<tr>One<td>two<tr>three<td>four";
	$arr=preg_split('/<(t[rd])>/',$str,-1,PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY);
	foreach($arr as $element) echo "$element "; //tr One td two tr three td four
?>

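Use PREG_SPLIT_OFFSET_CAPTURE to get the offset of each substring in the original string. Each element of the result is then an array holding the substring at index 0 and its offset at index 1.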
<?PHP
	$str="<td>One<td>two<td>three<td>four";
	$arr=preg_split('/<td>/',$str,-1,PREG_SPLIT_OFFSET_CAPTURE | PREG_SPLIT_NO_EMPTY);
	foreach($arr as $element)
	{
	   echo "$element[0], $element[1];\n"; //One, 4; two, 11; three, 18; four, 27; 
	}
?>
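
One way to check what the reported offsets mean is to read each piece back from the original string with substr(). This is only a sketch: the variable names $piece, $offset and $same are illustrative, and the offsets are treated as byte positions, which matches character positions for the ASCII sample used here.

<?PHP
	$str="<td>One<td>two<td>three<td>four";
	$arr=preg_split('/<td>/',$str,-1,PREG_SPLIT_OFFSET_CAPTURE | PREG_SPLIT_NO_EMPTY);
	foreach($arr as $element)
	{
	   list($piece,$offset)=$element;
	   // Reading back from the original string at the reported offset gives the same piece.
	   $same=(substr($str,$offset,strlen($piece))===$piece);
	   echo "$piece at $offset: ",($same ? "match" : "mismatch"),"\n";
	}
?>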