{"id":1177,"date":"2010-11-23T15:44:19","date_gmt":"2010-11-23T14:44:19","guid":{"rendered":"http:\/\/glandium.org\/blog\/?p=1177"},"modified":"2010-12-04T12:36:28","modified_gmt":"2010-12-04T11:36:28","slug":"improving-libxul-startup-io-by-hacking-the-elf-format","status":"publish","type":"post","link":"https:\/\/glandium.org\/blog\/?p=1177","title":{"rendered":"Improving libxul startup I\/O by hacking the ELF format"},"content":{"rendered":"<p>If you have been following this blog during the past months, you might have gathered that I've been focusing on I\/O patterns on library initialization, more specifically <code>libxul.so<\/code>. The very patterns there's not much to do about, except enduring them, or hacking <code>ld.so<\/code> and\/or the toolchain. The latter is not necessarily easy, and won't benefit everyone until all linux distros use an improved glibc and\/or toolchain. This could take years before any actual result would reach end users.<\/p>\n<p>But with a little creativity, one can overcome these limitations and improve the situation today. Here is a summary of my recent hacks.<\/p>\n<p><a name=\"static-initializers\"><\/a><\/p>\n<h2>Backwards static initializers<\/h2>\n<p>As <a href=\"http:\/\/blog.mozilla.com\/tglek\/2010\/05\/27\/startup-backward-constructors\/\">Taras showed in the past<\/a>, static initializers are executed in reverse order. It turns out it is an unfortunate design choice in <code>gcc<\/code>, and it also turns out not to impact all ELF binary files.<\/p>\n<p>As far as I know there are currently two different implementations of the static initializers call loop, one provided by <code>gcc<\/code>, and one provided by <code>ld.so<\/code>.<\/p>\n<p>In the <code>gcc<\/code> case, the compiler creates a <code>.ctors<\/code> section, starting with <code>0xffffffff<\/code>, ending with <code>0x00000000<\/code>, and filled with pointers to the various static initializer functions, in ascending order. 
The compiler also injects an initializer function, called from <code>.init<\/code>, going through this <code>.ctors<\/code> section, starting from the end (<code>__do_global_ctors_aux<\/code> in <code>crtend.o<\/code>). A similar <code>.dtors<\/code> section is created for functions to run when the library is unloaded, and the pointers in it are called in ascending order. The order in which <code>.ctors<\/code> and <code>.dtors<\/code> are executed does not actually matter in itself. All that matters is that they are executed in reverse order from each other, because that limits the risks of bad ordering. It is only unfortunate that <code>.ctors<\/code> was chosen to be the one backwards.<\/p>\n<p>In the <code>ld.so<\/code> case, the compiler creates a <code>.init_array<\/code> section, filled with pointers to the various static initializer functions, in ascending order. It also stores the location and the size of this section in the <code>PT_DYNAMIC<\/code> ELF header. At runtime, <code>ld.so<\/code> itself reads this information and executes the static initializers in ascending order. A similar <code>.fini_array<\/code> section is created for functions to run when the library is unloaded, and <code>ld.so<\/code> goes through it in descending order.<\/p>\n<p>The <code>.init_array<\/code> section is newer, and for backwards compatibility reasons, is not used on systems that have been using the <code>.ctors<\/code> approach forever. On the other hand, some newer ABIs, such as ARM EABI, use the new facility, which means they are not victims of backwards static initialization. 
I do wonder, however, whether backwards compatibility is still needed, considering how long <code>.init_array<\/code> support has been around (more than 11 years, according to the <a href=\"http:\/\/sourceware.org\/git\/?p=glibc.git;a=commitdiff;h=fcf70d4114db9ff7923f5dfeb3fea6e2d623e5c2;hp=3f3822198993be18d4d9ccb1593eea274dbd2ba0\">glibc git repository<\/a>).<\/p>\n<p>While we could obviously bug the <code>gcc<\/code> maintainers to reverse their <code>.ctors<\/code> execution order, or to ditch <code>.ctors<\/code> support on glibc systems, neither of which would bring any immediate global benefit, we can also just <a href=\"https:\/\/bugzilla.mozilla.org\/show_bug.cgi?id=606137\">reverse the <code>.ctors<\/code> section content<\/a>. Please note the source code out there doesn't take care of reversing the <code>.dtors<\/code> section accordingly, because it is currently empty in <code>libxul.so<\/code>.<\/p>\n<p><a name=\"relocations\"><\/a><\/p>\n<h2>Relocations<\/h2>\n<p>As I mentioned in an <a href=\"\/blog\/?p=1016\">earlier post<\/a>, relocations are what make it possible to load libraries at arbitrary locations. Unfortunately, a great number of relocations means awful I\/O patterns: <code>ld.so<\/code> reads and applies them one by one, and while the kernel reads ahead in 128 KB chunks, when your relocation section exceeds that amount by more than an order of magnitude, it means I\/O goes back and forth a lot between the relocation section and where the relocations take place.<\/p>\n<p>There are two classes of relocations: REL and RELA. In both cases, there are several types of relocations, some of which require an addend. The difference between the two classes is that in REL relocations, the addend is stored at the offset the relocation takes place, and in RELA relocations, the addend is stored in the relocation entry. 
It means that RELA relocations effectively waste space, as the addend ends up actually being stored twice: it is also stored at the offset the relocation takes place, like with REL relocations, though it could technically be zeroed out there; in any case, the space is wasted.<\/p>\n<p>REL relocations thus take 2 words of storage, and RELA relocations, 3. On x86, the ELF ABI only uses REL relocations, and words are 32 bits, making each relocation take 8 bytes; on x86-64, the ELF ABI only uses RELA relocations, and words are 64 bits, thus making each relocation take 24 bytes, 3 times that of x86 relocations. On ARM, both may be used, but apparently the compiler and the linker only use REL relocations.<\/p>\n<p>These words of storage are used to store 3 or 4 pieces of information:<\/p>\n<ul>\n<li><code>r_offset<\/code>: the offset at which the relocation takes place<\/li>\n<li><code>r_info<\/code>: containing both the relocation type and a symbol reference for the relocation. On 32-bit systems, the symbol reference is stored on 24 bits, and the relocation type on 8 bits. On 64-bit systems, the symbol reference is stored on 32 bits, and the relocation type on 32 bits (which means at least 24 bits are never going to be used: there is currently no ABI using more than 256 relocation types).<\/li>\n<li><code>r_addend<\/code>: The addend, only in RELA relocations<\/li>\n<\/ul>\n<p>The linker splits relocations into two main categories: those for dynamic linking at startup and those for procedure linkage table (PLT) resolution at runtime. 
Here is what the distribution looks like on <code>libxul.so<\/code> (taken from the mozilla-central nightly from Nov, the 22nd):<\/p>\n<table class=\"table\">\n<tr>\n<th><\/th>\n<th colspan=\"2\">dynamic relocations<\/th>\n<th colspan=\"2\">PLT relocations<\/th>\n<\/tr>\n<tr>\n<th>Architecture<\/th>\n<th>number<\/th>\n<th>size (bytes)<\/th>\n<th>number<\/th>\n<th>size (bytes)<\/th>\n<\/tr>\n<tr>\n<td>x86<\/td>\n<td>231,836<\/td>\n<td>1,854,688<\/td>\n<td>3,772<\/td>\n<td>30,176<\/td>\n<\/tr>\n<tr>\n<td>x86-64<\/td>\n<td>235,893<\/td>\n<td>5,661,432<\/td>\n<td>3,773<\/td>\n<td>90,552<\/td>\n<\/tr>\n<\/table>\n<p>So, PLT relocations are quite negligible, and dynamic relocations are just huge, especially on x86-64. Compared to the binary size, it's pretty scary:<\/p>\n<table class=\"table\">\n<tr>\n<th>Architecture<\/th>\n<th><code>libxul.so<\/code> size<\/th>\n<th>relocations size<\/th>\n<th>%<\/th>\n<\/tr>\n<tr>\n<td>x86<\/td>\n<td>21,869,684<\/td>\n<td>1,884,864<\/td>\n<td>8.61%<\/td>\n<\/tr>\n<tr>\n<td>x86-64<\/td>\n<td>29,629,040<\/td>\n<td>5,751,984<\/td>\n<td>19.41%<\/td>\n<\/tr>\n<\/table>\n<p>Let me remind you that not only do we need to read these relocations at startup (at least the dynamic ones, but the PLT ones are just negligible), but we only read them by chunks, and need to read other data in between.<\/p>\n<p>Of the many existing types of relocations, only a few are actually used in dynamic libraries. 
Here is a summary of those in use in <code>libxul.so<\/code>:<\/p>\n<table class=\"table\">\n<tr>\n<th>x86 relocation type<\/th>\n<th>number<\/th>\n<th>x86-64 relocation type<\/th>\n<th>number<\/th>\n<\/tr>\n<tr>\n<th colspan=\"4\" style=\"text-align: center\">dynamic relocations<\/th>\n<\/tr>\n<tr>\n<td><code>R_386_TLS_DTPMOD32<\/code><\/td>\n<td>1<\/td>\n<td><code>R_X86_64_DTPMOD64<\/code><\/td>\n<td>1<\/td>\n<\/tr>\n<tr>\n<td><code>R_386_GLOB_DAT<\/code><\/td>\n<td>213<\/td>\n<td><code>R_X86_64_GLOB_DAT<\/code><\/td>\n<td>238<\/td>\n<\/tr>\n<tr>\n<td><code>R_386_32<\/code><\/td>\n<td>27,643<\/td>\n<td><code>R_X86_64_64<\/code><\/td>\n<td>27,611<\/td>\n<\/tr>\n<tr>\n<td><code>R_386_RELATIVE<\/code><\/td>\n<td>203,979<\/td>\n<td><code>R_X86_64_RELATIVE<\/code><\/td>\n<td>208,043<\/td>\n<\/tr>\n<tr>\n<th colspan=\"4\" style=\"text-align: center\">PLT relocations<\/th>\n<\/tr>\n<tr>\n<td><code>R_386_JUMP_SLOT<\/code><\/td>\n<td>3,772<\/td>\n<td><code>R_X86_64_JUMP_SLOT<\/code><\/td>\n<td>3,773<\/td>\n<\/tr>\n<\/table>\n<p>The above types can be separated into two distinct categories:<\/p>\n<ul>\n<li>Requiring symbol resolution (<code>R_*_GLOB_DAT<\/code>, <code>R_*_JUMP_SLOT<\/code>, <code>R_386_32<\/code>, <code>R_X86_64_64<\/code>)<\/li>\n<li>Not requiring symbol resolution (<code>R_*_RELATIVE<\/code>, <code>R_386_TLS_DTPMOD32<\/code>, <code>R_X86_64_DTPMOD64<\/code>)<\/li>\n<\/ul>\n<p>In the second category, it means we are wasting all the bits for the symbol reference in each entry: 24 on x86 and 32 on x86-64. Unfortunately, the vast majority of the relocations in <code>libxul.so<\/code> are from that category. At this point, this is how much waste we identified:<\/p>\n<table class=\"table\">\n<tr>\n<th>description<\/th>\n<th>num. 
entries<\/th>\n<th>waste per entry (bytes)<\/th>\n<th>total<\/th>\n<th>% of relocation size<\/th>\n<\/tr>\n<tr>\n<th style=\"text-align: center\" colspan=\"5\">x86<\/th>\n<\/tr>\n<tr>\n<td>Relocation without symbol ref.<\/td>\n<td>203,980<\/td>\n<td>3<\/td>\n<td>611,937<\/td>\n<td>32.46%<\/td>\n<\/tr>\n<tr>\n<th style=\"text-align: center\" colspan=\"5\">x86-64<\/th>\n<\/tr>\n<tr>\n<td>Duplicated addend<\/td>\n<td>239,666<\/td>\n<td>8<\/td>\n<td>1,917,328<\/td>\n<td>33.33%<\/td>\n<\/tr>\n<tr>\n<td>Relocation type encoded on 32 bits<\/td>\n<td>239,666<\/td>\n<td>3<\/td>\n<td>718,998<\/td>\n<td>12.49%<\/td>\n<\/tr>\n<tr>\n<td>Relocation without symbol ref.<\/td>\n<td>208,044<\/td>\n<td>4<\/td>\n<td>832,176<\/td>\n<td>14.46%<\/td>\n<\/tr>\n<tr>\n<td colspan=\"3\">Total<\/td>\n<td>3,468,502<\/td>\n<td>60.30%<\/td>\n<\/tr>\n<\/table>\n<p>On top of that, one could argue that we don't need 24 or 32 bits for symbol reference (which is the entry number in the symbol table), and that we're really far from requiring 64 bits for <code>r_offset<\/code> on 64-bit systems, as <code>libxul.so<\/code> is far from being bigger than 4GB. Anyways, that means we have room for improvement here, and it means that the ELF format, especially on 64-bit systems with RELA relocations, really doesn't help with big programs or libraries with a lot of relocations. (For that matter, the mach-o format used on OS X has some kind of similar issues; I don't know about the DLL format)<\/p>\n<p>Another interesting data point is looking at how relocations' <code>r_offset<\/code> values follow each other, once relocations are reordered (to avoid too much symbol resolution, they are actually grouped by symbol). It turns out 206,739 relocations' <code>r_offset<\/code> are at 4 bytes from the previous one on x86 (i.e. strictly following it), and 210,364 at 8 bytes from the previous on x86-64. 
The main reason why this happens is that most relocations are happening in pointer tables, and <code>libxul.so<\/code> has lots of them:<\/p>\n<ul>\n<li>That's what you get with virtual methods in C++.<\/li>\n<li>That's also what you get in <code>.ctors<\/code> and <code>.dtors<\/code> sections, which are relocated as well (but not <code>.init_array<\/code> and <code>.fini_array<\/code> sections).<\/li>\n<li>That's what you get when you use function tables.<\/li>\n<li>Obviously, that's what you get when you use pointer tables.<\/li>\n<\/ul>\n<p>And when I say <code>libxul.so<\/code> has lots of them, that's not an overstatement: there are 184,320 relocations due to vtables (function tables you get when using virtual methods in C++) on x86 and 184,098 on x86-64. That's respectively 78.23% and 76.81% of the total number of relocations. Interestingly, while most of these relocations are <code>R_*_RELATIVE<\/code>, 26,853 are <code>R_386_32<\/code>, and 26,829 are <code>R_X86_64_64<\/code>. Except for a very few, these non relative relocations are for the <code>__cxa_pure_virtual<\/code> symbol, which is a dummy function that does nothing, used for pure virtual methods, such as the following:<\/p>\n<blockquote><p><code>class Abstract {<br \/>\npublic:<br \/>\n   virtual void pure_virtual() = 0;<br \/>\n};<\/code><\/p><\/blockquote>\n<p>As for the remaining relative relocations, there are a few pointer tables responsible for several thousand relocations, such as <code>JSString::length2StringTable<\/code>, <code>gEntries<\/code>, <code>GkAtoms_info<\/code>, <code>Html5Atoms_info<\/code>, <code>kUnixCharsets<\/code>, <code>nsIDOMCSS2Properties_properties<\/code>, etc.<\/p>\n<p><a name=\"packing-relocations\"><\/a><\/p>\n<h2>Packing relocations<\/h2>\n<p>While they are not detrimental to most binaries, we saw that relocations can have a bad effect on startup I\/O for big binaries. Reducing how much space they take would have a direct impact on these startup I\/Os. 
One way to do so is obviously to reduce the number of relocations. But as we saw above, the vast majority are due to vtables. While reducing vtables would be a nice goal, it's a long way before this can have a significant impact on the relocations number. So another possible way is to try to avoid wasting so much space in relocations.<\/p>\n<p>The main problem is that relocations are applied by <code>ld.so<\/code>, and that, as mentioned in the introduction, while we can hack <code>ld.so<\/code> to do what we want, it won't work everywhere for a long time. Moreover, it's difficult to test various implementations, and binary compatibility needs to be guaranteed once something is applied in the upstream libc.<\/p>\n<p>The idea I started to test a few weeks ago is to effectively apply relocations by hand instead of relying on <code>ld.so<\/code> doing it. Considering the first thing <code>ld.so<\/code> does with a library after applying relocations is to call its <code>DT_INIT<\/code> function, the idea was to replace the dynamic relocations we can with a new <code>DT_INIT<\/code> function we would inject that'd apply those relocations, while storing them in a more efficient manner. The function would obviously need to call the original <code>DT_INIT<\/code> function once done. Some parts of the <code>PT_DYNAMIC<\/code> section also obviously need to be adjusted accordingly.<\/p>\n<p>The first iteration I tested a few weeks ago helped validate the idea, but didn't change much of the binary: the dynamic relocation section still was taking all the space it took before, though its content was mostly empty (we'll see further below how efficient even the current dumb format is). Once the theory that an injected <code>DT_INIT<\/code> function could apply the relocations without breaking everything was validated, I went on to implement a program that would actually strip the ELF binary. 
I won't get too much into the gory details, but it required <code>PT_LOAD<\/code> splitting, thus program headers growth, which means moving the following sections, then removing a huge part of the relocations, shrinking the relocations section, injecting our code and the packed relocations as sections, and moving the remaining sections after these new sections, to avoid empty space, but still respecting the proper offsets for both <code>PT_LOAD<\/code> and the code itself.<\/p>\n<p>So far, I only took care of the <code>R_*_RELATIVE<\/code> relocations. Some but not all other relocations could be taken care of, too, but everything involving symbol resolution from other libraries would be really difficult to handle from the injected code. The relocations are currently stored with a quite dumb format:<\/p>\n<ul>\n<li>a 32-bit word containing <code>r_offset<\/code><\/li>\n<li>a 32-bit word containing the number of relocations to apply consecutively<\/li>\n<\/ul>\n<p>As we only take care of one type of relocation, we can skip any information about the type. Also, the addend already being stored at <code>r_offset<\/code>, we just strip it from RELA relocations; the binary being far from 4GB big, 32 bits is enough on 64-bit systems too. We also saw that a lot of relocations were next to the following ones, so instead of storing <code>r_offset<\/code> for each one, storing some kind of range is more efficient. 
As I've so far been focusing on getting something that actually works more than on getting something really efficient, I only tested this format, but the results are already nice:<\/p>\n<table class=\"table\">\n<tr>\n<th><\/th>\n<th>x86<\/th>\n<th>x86-64<\/th>\n<\/tr>\n<tr>\n<th colspan=\"3\" style=\"text-align: center\">before<\/th>\n<\/tr>\n<tr>\n<th>dynamic relocations (bytes)<\/th>\n<td>1,854,688<\/td>\n<td>5,661,432<\/td>\n<\/tr>\n<tr>\n<th><code>libxul.so<\/code> size<\/th>\n<td>21,869,684<\/td>\n<td>29,629,040<\/td>\n<\/tr>\n<tr>\n<th colspan=\"3\" style=\"text-align: center\">after<\/th>\n<\/tr>\n<tr>\n<th>dynamic relocations (bytes)<\/th>\n<td>222,856<\/td>\n<td>668,400<\/td>\n<\/tr>\n<tr>\n<th>injected code size<\/th>\n<td>129<\/td>\n<td>84<\/td>\n<\/tr>\n<tr>\n<th>packed relative relocations (bytes)<\/th>\n<td>224,816<\/td>\n<td>228,568<\/td>\n<\/tr>\n<tr>\n<th><code>libxul.so<\/code> size<\/th>\n<td>20,464,836 (-6.42%)<\/td>\n<td>24,865,520 (-16.07%)<\/td>\n<\/tr>\n<\/table>\n<p>Getting rid of the <code>__cxa_pure_virtual<\/code> relocations would help getting this even further down, and an even better storage format could be used, too. Gathering new <a href=\"\/blog\/?p=1105\">startup coverage data<\/a> should be interesting as well.<\/p>\n<p>The code is <a href=\"http:\/\/hg.mozilla.org\/users\/mh_glandium.org\/elfhack\/\">available via mercurial<\/a>. Please note the <code>Makefile<\/code> deeply sucks, and that the code is in the middle of its third refactoring. Anyways, once you get the code to compile, you just need to run <code>test-x86<\/code> or <code>test-x64<\/code> depending on your architecture, from the build directory, and give it the <code>libxul.so<\/code> file as an argument. It will create a <code>libxul.so.new<\/code> file that you can use to replace <code>libxul.so<\/code>. 
You may successfully use the tool on other libraries, but chances are you'll bump into an <code>assert()<\/code>, or in the worst case, that the resulting library ends up broken. The tool might work on PIE executables, but most probably not on other executables. If you wonder, Chrome is not PIE. It's not even relocatable (which means ASLR won't work with it); or at least the x86-64 binaries I just got aren't.<\/p>\n<h2>More possible ELF hacking<\/h2>\n<p>The code I wrote to pack relocations makes it possible to perform various tricks on ELF files, and it opens various opportunities, such as:<\/p>\n<ul>\n<li>Aligning <code>.bss<\/code>. As I <a href=\"\/blog\/?p=1016\">wrote on this blog a while ago<\/a>, <code>ld.so<\/code> fills the last page from the <code>.data<\/code> section with zeroes, starting at the offset of the <code>.bss<\/code> section. Unfortunately, this means reading ahead at most 128KB of data nobody needs, and the corresponding disk seek. Inserting a section full of zeroes between <code>.data<\/code> and <code>.bss<\/code> would get rid of that, while minimally wasting space (less than 4KB in a 20MB+ binary).<\/li>\n<li>Trying to move the <code>PT_DYNAMIC<\/code> section. 
It is among the first pieces of data to be read from programs and libraries, yet it is near the end of the binary.<\/li>\n<li>Replacing <code>.ctors<\/code>\/<code>.dtors<\/code> with <code>.init_array<\/code>\/<code>.fini_array<\/code> (thus removing the corresponding relocations, though there are only around 300 of them).<\/li>\n<li>Replacing the <code>__cxa_pure_virtual<\/code> relocations with relative relocations pointing to an injected dummy function (it could even just point to any <code>ret<\/code> code in any existing function).<\/li>\n<li>Replacing some internal symbol resolutions with relative relocations (which should be equivalent to building with <code>-Bsymbolic<\/code>)<\/li>\n<li>etc.<\/li>\n<\/ul>\n<p>Some of the above could probably be done with linker scripts, but using linker scripts is unnecessarily overcomplicated: as soon as you start using a linker script to fiddle with sections, you need to reimplement the entire default linker script (the one you can see with <code>ld --verbose<\/code>). And it's not guaranteed to work (Taras told me he had a hard time trying, and failing, to align the <code>.bss<\/code> section). And as far as I know, you can't use the same thing depending on whether you use <code>ld<\/code> or <code>gold<\/code> (AIUI, <code>gold<\/code> doesn't support the whole <code>ld<\/code> linker script syntax).<\/p>\n","protected":false},"excerpt":{"rendered":"<p>If you have been following this blog during the past months, you might have gathered that I&#8217;ve been focusing on I\/O patterns on library initialization, more specifically libxul.so. The very patterns there&#8217;s not much to do about, except enduring them, or hacking ld.so and\/or the toolchain. 
The latter is not necessarily easy, and won&#8217;t benefit [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[7,25],"tags":[23],"class_list":["post-1177","post","type-post","status-publish","format-standard","hentry","category-firefox","category-planet-mozilla","tag-en"],"_links":{"self":[{"href":"https:\/\/glandium.org\/blog\/index.php?rest_route=\/wp\/v2\/posts\/1177","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/glandium.org\/blog\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/glandium.org\/blog\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/glandium.org\/blog\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/glandium.org\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=1177"}],"version-history":[{"count":70,"href":"https:\/\/glandium.org\/blog\/index.php?rest_route=\/wp\/v2\/posts\/1177\/revisions"}],"predecessor-version":[{"id":1260,"href":"https:\/\/glandium.org\/blog\/index.php?rest_route=\/wp\/v2\/posts\/1177\/revisions\/1260"}],"wp:attachment":[{"href":"https:\/\/glandium.org\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=1177"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/glandium.org\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=1177"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/glandium.org\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=1177"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}